(54) Title: REST PROTEIN AND DNA
(57) Abstract

The invention provides a substantially pure nucleic acid encoding a protein that inhibits the expression of neural proteins in non-neural
tissues, and a substantially pure nucleic acid encoding a protein that binds to a promoter having at least about 90%
homology to nucleotides 6-28 of the RE1 sequence and acting to suppress the activity of a promoter containing that
sequence. The invention further provides a substantially pure nucleic acid encoding a protein having at least about 85% homology to at
least the DNA binding domain or the suppressor domain of an animal RE1-Silencing Transcription factor. The invention also relates
to the proteins so encoded.
REST PROTEIN AND DNA

The present invention is directed to purified nucleic acids encoding RE1-Silencing
Transcription factors ("REST proteins") and to purified proteins with REST activity.

Part of the work performed during the development of this invention utilized United States
Government Funds under National Institutes of Health Grant NS22518 and National Science
Foundation Grant GER9023237. The government has certain rights in this invention.

It has been suggested that neural development is substantially a default pathway of
development that is repressed in non-neural cell types. Consistent with this idea, Kraner et al.,
Neuron 9, 37-44, 1992, identified a DNA sequence, the 28 base pair ("bp") RE1 sequence,
in the 5' flanking sequence of the gene for the membrane protein that forms the CNS-type voltage
dependent sodium channel (i.e., "type II" voltage dependent sodium channel), that appears to be
responsible for negatively regulating the use of this gene in non-neural tissue. RE1 nucleic acid
sequences also appear to interact with a nuclear protein found in non-neural cells but not in most
neural cells. Similar sequences having cell-specific silencer activity have been identified in the
promoters for SCG10 (Mori et al., Neuron 9, 45-54, 1992), synapsin (Li et al., Proc. Natl. Acad.
Sci. USA 90, 1460-1464, 1993) and dopamine β-hydroxylase (Ishiguro et al., J. Biol. Chem. 268,
17987-17994, 1993).
Summary of the Invention 

Until now, however, the protein responsible for silencing promoters containing RE1
elements has not been identified. That protein, herein referred to as "REST," and the gene encoding
it, is herein identified as having the amino acid sequence included in SEQ ID NO:1. The portion
of the nucleic acid sequence included in SEQ ID NO:1 that is an open reading frame for REST is
identified as SEQ ID NO:10. The protein sequence for human REST and the nucleic acid sequence
of the cDNA for human REST are shown in Figure 1.

One preferred embodiment of the present invention is a substantially pure nucleic acid
comprising a nucleic acid encoding a protein having at least about 85% homology to at least the
DNA binding domain or the suppressor domain of an animal REST protein; the same substantially
pure nucleic acid further comprising a nucleic acid encoding at least the DNA binding domain or
the suppressor domain of an animal REST protein; the same substantially pure nucleic acid,
wherein the REST protein is a mammalian REST protein; the same substantially pure nucleic acid,
wherein the REST protein is a human REST protein; the same substantially pure nucleic acid,
wherein the nucleic acid comprises SEQ ID NO:2; the same substantially pure nucleic acid, wherein
the nucleic acid comprises SEQ ID NO:10; the same substantially pure nucleic acid, further
comprising a nucleic acid encoding both the DNA binding domain and the suppressor domain of an
animal REST protein; the same substantially pure nucleic acid, wherein the REST protein is a
mammalian REST protein; the same substantially pure nucleic acid, wherein the REST protein is a
human REST protein; the same substantially pure nucleic acid, wherein the nucleic acid comprises
SEQ ID NO:2; the same substantially pure nucleic acid, wherein the nucleic acid comprises SEQ
ID NO:10; the same substantially pure nucleic acid, comprising a nucleic acid encoding a protein
differing from an animal REST protein by no more than about 20 point mutations. Preferred
substantially pure nucleic acids also encode analogs to the REST protein, which include either the
DNA binding domain or the suppressor domain thereof.

Another preferred embodiment of the present invention is a substantially pure nucleic acid
that hybridizes with an animal REST nucleic acid under stringent conditions; the same substantially
pure nucleic acid, comprising the nucleic acid of SEQ ID NO:1.

A further preferred embodiment is a substantially pure nucleic acid comprising a nucleic
acid encoding a protein that binds to a promoter having at least about 90% homology to nucleotides
6-28 of SEQ ID NO:29 and acting to suppress the activity of a promoter having said promoter.

Yet another preferred embodiment is a substantially pure protein having at least about 85%
homology with at least the DNA binding domain or the suppressor domain of an animal REST
protein; the same substantially pure protein, comprising at least the DNA binding domain or the
suppressor domain of an animal REST protein; the same substantially pure protein, further
comprising the protein of SEQ ID NO:2; the same substantially pure protein, further comprising
both the DNA binding domain and the suppressor domain of an animal REST protein; the same
substantially pure protein, further comprising the protein of SEQ ID NO:10.

Yet another preferred embodiment is a transformed eukaryotic or prokaryotic cell
comprising a nucleic acid encoding a protein having at least about 85% homology to at least one of
the DNA binding domain or the suppressor domain of an animal REST protein; the same
transformed cell, further comprising a nucleic acid encoding at least the DNA binding domain or
the suppressor domain of an animal REST protein; the same transformed cell, wherein the REST
protein is a mammalian REST protein; the same transformed cell, wherein the REST protein is a
human REST protein; the same transformed cell, wherein the nucleic acid comprises SEQ ID NO:2.
Preferably, the transformed cell expresses one of the inventive proteins described herein.

Yet another preferred embodiment is a vector capable of reproducing in a eukaryotic or
prokaryotic cell comprising a nucleic acid encoding a protein having at least about 85% homology
to at least the DNA binding domain or the suppressor domain of an animal REST protein; the same
vector capable of reproducing in a eukaryotic or prokaryotic cell, further comprising a nucleic acid
encoding at least the DNA binding domain or the suppressor domain of an animal REST protein;
the same vector capable of reproducing in a eukaryotic or prokaryotic cell, wherein the REST
protein is a mammalian REST protein; the same vector capable of reproducing in a eukaryotic or
prokaryotic cell, wherein the REST protein is a human REST protein; the same vector capable of
reproducing in a eukaryotic or prokaryotic cell, wherein the nucleic acid comprises SEQ ID NO:2.
Preferably, the inventive vector expresses, intracellularly or extracellularly, one of the inventive
proteins described herein.

Yet another preferred embodiment is a method of preparing a protein having REST activity,
wherein the protein has at least about 85% homology with at least the DNA binding domain or the
suppressor domain of an animal REST protein, the method comprising:

(a) transforming an appropriate eukaryotic or prokaryotic cell with an expression
vector for expressing intracellularly or extracellularly a nucleic acid encoding the protein;

(b) growing the transformed cell in culture; and

(c) isolating the protein from the transformed cell or the culture medium.

Yet another preferred embodiment is a pharmaceutical composition for treating an animal
having de-differentiated neural cells or neural cells exhibiting diminished activity comprising an
effective amount of a REST-interfering nucleic acid, wherein the REST-interfering nucleic acid
comprises an antisense molecule directed against REST expression or an expression vector for
expressing REST DNA binding activity but not REST silencer activity, and a pharmaceutically
acceptable carrier; the same pharmaceutical composition, wherein the animal has brain cancer; the
same pharmaceutical composition, wherein said animal has a demyelinating disease, myasthenia gravis,
muscular dystrophy, botulism, peripheral neuropathies, traumatic nerve injury, post-stroke
degeneration, post-traumatic spinal and neural degeneration, poliomyelitis or rabies.

Yet another preferred embodiment is a pharmaceutical composition for an animal having
neural cells exhibiting excessive neural activity comprising an effective amount of an expression
vector comprising a nucleic acid encoding a protein that inhibits the expression of neural proteins in
non-neural tissues, and a pharmaceutically acceptable carrier; the same pharmaceutical composition,
wherein the animal has epilepsy, Lennox-Gastaut syndrome, spasticity, trauma-induced pain,
schizophrenia, stroke or a neurodegenerative disease; the same pharmaceutical composition,
wherein the animal has Alzheimer's, Parkinson's or Huntington's disease; the same pharmaceutical
composition, wherein the animal has epilepsy; the same pharmaceutical composition, wherein the
animal has a neurodegenerative disease.

Yet another preferred embodiment is a method of determining the level of REST expression
in a tissue sample comprising:

(a) contacting the tissue sample with (i) a nucleic acid that binds to REST mRNA
under stringent conditions or (ii) an antibody specific for REST;

(b) washing the tissue sample to remove non-specific hybridizations of the nucleic
acid or non-specific antibody binding; and

(c) determining the level of hybridized nucleic acid or bound antibody.

Yet another preferred embodiment is an antibody that reacts specifically with the
substantially pure protein having at least about 85% homology with at least the DNA binding
domain or the suppressor domain of an animal REST protein, as recited above.

Brief Description of the Drawings

Figure 1 shows the protein encoded by the open reading frame of SEQ ID NO:1 and the
nucleotide sequence of SEQ ID NO:1.

Detailed Description of the Invention

The DNA binding domain of REST is made up of eight zinc finger domains. The portion
of SEQ ID NO:1 that encompasses the eight zinc finger domains of REST is identified as SEQ ID
NO:2. The underlined residues shown in Figure 1 are the zinc finger domains. A search of the
GenBank database found that the closest homology for this DNA binding domain is found with the
Krüppel family of repressor proteins, particularly the GLI-Krüppel repressor protein. (For a review
of zinc finger proteins, see Coleman, Ann. Rev. Biochem. 61, 897-946, 1992.) The size of the RE1
sequence, 28 bp, and the number of zinc finger domains in REST are consistent with research
(Pavletich and Pabo, Science 252, 809-817, 1991) that suggests that each such zinc finger domain
interacts with a triplet of nucleotide base pairs.
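
By way of illustration only, the arithmetic behind this observation can be sketched in Python. The function and constant names are hypothetical and are not part of the invention; the only inputs are the figures stated above (eight zinc fingers, one triplet per finger, a 28 bp RE1 sequence).

```python
# Illustrative sketch: base pairs contacted if each zinc finger reads one triplet.
BP_PER_FINGER = 3    # one nucleotide triplet per finger, per the cited model
NUM_FINGERS = 8      # zinc finger domains in REST
RE1_LENGTH_BP = 28   # size of the RE1 sequence


def footprint_bp(num_fingers: int, bp_per_finger: int = BP_PER_FINGER) -> int:
    """Return the number of base pairs contacted by a tandem array of fingers."""
    return num_fingers * bp_per_finger


contacted = footprint_bp(NUM_FINGERS)
# The contacted length fits within the RE1 sequence.
print(contacted, contacted <= RE1_LENGTH_BP)
```

Under the triplet model, the eight fingers would contact 24 bp, which fits within the 28 bp RE1 sequence.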

The sequences of the zinc finger domains are indicated in the table below (with a space
inserted into 6 of the 8 sequences to facilitate alignment of homologous sequence):

SEQ ID NO.   Zinc Finger Sequence
11           CKPCQYEAESEEQFVHHIRV H
12           CDRCGYNTNRYDHYTAHLKH H
13           CIICTYTTVSEYHWRKHLRN H
14           CGKCNYFSDRKNNYVQHVRT H
15           CELCPYSSSQKTHLTRHMRT H
16           CDQCSYVASNQHEVTRHARQVH
17           CPHCDYKTADRSNFKKHVEL H
18           CPVCDYAASKKCNLQYHFKSKH

C-terminal to the DNA binding domain, REST has six repeat sequences having the
following sequences:

SEQ ID NO.   Internal Homologous Sequences
21           MMM:^M EGPAiQKjEli. L PjlP
22
23
24           Mmm/mmm^^ k i iEoiSi^iiip
25
26           M G yai^rdiji^ftf $P A Q R E P P P^iP


These sequences are indicated in Figure 1 by the double underlined amino acid residues. The
sequence encompassing these repeats is designated SEQ ID NO:20. The most highly conserved
residues of the six repeats are highlighted in the table above.

By studying the activity of the RE1 promoter, it has been determined that REST is
expressed in undifferentiated neural progenitors, which is consistent with the view that REST plays
a role in maintaining the undifferentiated state of these cells. Antisense oligonucleotides directed
against the REST transcript, accordingly, would promote the differentiated state. Also consistent
with this view is the hypothesis that certain neuroblastoma cells have de-differentiated into analogs
of neural progenitors. Accordingly, REST antisense therapy aids in reversing this
de-differentiation and reducing or reversing the malignancy of these cells.
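
As a minimal illustration of what "antisense" means at the sequence level, the following Python sketch derives the DNA oligonucleotide complementary to a stretch of mRNA. The function name and the six-nucleotide target are hypothetical examples; the target is not taken from the REST transcript, and the sketch says nothing about which regions of the transcript would make effective targets.

```python
# Illustrative sketch: antisense DNA oligonucleotide for an mRNA target.
# Maps each RNA base to its complementary DNA base.
COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna_target: str) -> str:
    """Return the complementary DNA oligonucleotide, written 5' to 3'."""
    target = mrna_target.upper()
    unexpected = set(target) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical target, chosen only to show the base pairing.
print(antisense_dna("AUGGCC"))
```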

As used herein, a "REST nucleic acid" means the REST-encoding nucleic acid, whether
RNA or DNA, synthetic or natural, found in a REST-expressing animal, or the complementary
strand thereof. "REST protein-encoding nucleic acid" or "nucleic acid encoding a REST protein"
refers to any nucleic acid, whether native or synthetic, RNA, DNA, or cDNA, that encodes a
REST protein. For recombinant expression purposes, codon usage preferences for the organism in
which such a nucleic acid is to be expressed are considered in designing a synthetic
REST protein-encoding nucleic acid. A "REST protein" is a REST homologous protein with the
ability to bind an RE1 sequence and to repress the activity of a promoter containing an RE1
sequence. An "animal REST protein" is a REST protein expressed by a member of the animal
kingdom; a "human REST protein" is a REST protein expressed by a human.

Vectors encoding a protein with RE1-binding activity but not suppressor activity are shown
herein to reverse the transcriptional suppression caused by REST, apparently by competing for the
RE1 promoter element through which REST functions. Accordingly, gene therapy with such
vectors is used, like the aforementioned and other antisense therapies known in the art, to reduce
REST's suppressor activity. The vectors described in this paragraph and the antisense molecules
discussed above are termed herein "REST-interfering nucleic acids."

Probes for REST expression are used to measure the extent of de-differentiation in biopsy
tissue from tumors that are derived from neural tissue. Such probes are used to predict the extent
of tissue transformation and the virulence of the tumor. Such probes include antibodies directed
against REST or fragments thereof, nucleic acid probes that hybridize to REST mRNA under
stringent conditions, and oligonucleotides that specifically prime a PCR amplification of REST
mRNA.

For a number of years physicians have sought to treat neurodegenerative diseases by
administering neural stem cells, for instance stem cells derived from embryos, to produce
replacements for a patient's lost neural cells. Such diseases include Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis ("Lou Gehrig's disease")
and demyelinating diseases such as multiple sclerosis. Stem cells used in these therapies are
induced to initiate differentiation to provide the needed replacement cells by treating them with
REST antisense constructs or with vectors expressing the DNA-binding domain of REST but not
the suppressor function of REST.

In diseases where pathological states are associated with excesses in neural activity, such as
epilepsy, Lennox-Gastaut syndrome, spasticity, trauma-induced pain, schizophrenia, stroke and
neurodegenerative diseases (including Alzheimer's, Parkinson's and Huntington's diseases), the
level of neural expression of the voltage-dependent sodium channel is usefully reduced. Toward
this end, neural cells are transformed to express sufficient REST to down-regulate expression of the
sodium channel.

In diseases that exhibit insufficient neural activity, such as demyelinating diseases (including
multiple sclerosis), myasthenia gravis, muscular dystrophy, botulism, peripheral neuropathies,
traumatic nerve injury, post-stroke degeneration, post-traumatic spinal cord neural degeneration,
poliomyelitis and rabies, up regulation of the expression of the neural voltage-dependent sodium
channel is useful. This up regulation is done by antisense therapy based on REST nucleic acids to
inhibit neural expression of REST or with gene therapy using a vector that expresses a protein that
competes with REST for RE1 promoter sequences without suppressing the activity of the promoter.

The REST protein is also a useful target for drug screening efforts to identify drugs that
interfere with its suppressor activity, either by inhibiting DNA binding or the negative effect of
REST on transcription. Such drug screening assays in one embodiment include cell-free
transcription systems using the REST protein, cell-free transcription systems such as those described
by Dignam et al., Nucl. Acids Res. 11, 1475-1489, 1983 or that described in the cell-free
transcription protocol available from Promega (Madison, WI), with an appropriate RE1-containing
promoter. The screening methods also utilize in other embodiments expression studies conducted in
cell culture, such as the chloramphenicol acetyl transferase (CAT) assay methods described herein
below.

The suppression domain of REST is fused by recombinant methods to a DNA-binding
domain of a positive transcription factor to create a protein that represses the activity of one or
more promoters. For instance, in one embodiment the suppressor domain is linked to Pit-1, a
transcription factor for the prolactin and growth hormone promoters (see Ingraham et al., Cell 55,
519-529, 1988), thereby creating a vector for gene therapeutics aimed at down regulating
hyperactive pituitary production of growth hormone and/or prolactin. Other examples of specific
targets for this kind of therapy are the DNA-binding domains of steroid hormone or thyroid
hormone receptors. Fusion vectors expressing a DNA binding domain from a steroid hormone
receptor and the REST suppressor domain are used in yet other embodiments to down regulate
responsiveness to the steroid hormones in patients that overproduce the steroid or that have steroid
hormone receptors that are too active. The fusion protein in one embodiment includes the target
DNA-binding element and substantially all of the REST protein.

The antibodies and nucleic acid probes of the present invention are also useful as 
histochemical reagents for marking the pathways of nerves that do not express the CNS-type sodium
channel. Also, the staining of most non-neural tissue serves as a contrast agent to highlight neurons 
that do not express REST or express very low levels of REST. Thus, these histochemical agents 
are used to produce histochemical slides and preserved anatomy specimens useful for training 
students and physicians. 

The first embodiment of the invention relates to a purified nucleic acid comprising a nucleic
acid encoding a protein having at least 85% homology to at least the DNA binding domain or the
suppressor domain of an animal REST protein. The protein encoded by such a nucleic acid is referred
to herein as a REST protein that binds
the RE1 promoter element and/or suppresses the activity of the promoter for the CNS-type
voltage-dependent sodium channel. The encoded protein is preferably a REST protein of a mammalian
animal, more preferably the human REST protein. Preferably, the encoded protein has the
sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:10.

Another embodiment of the invention provides for one or more nucleic acids encoding a
protein that binds to a promoter sequence having at least about 90% homology, preferably 95%
homology, to nucleotides 6-28 of the RE1 sequence (SEQ ID NO:29) and acting to suppress the
activity of a promoter containing that promoter sequence. Yet another embodiment provides for a
nucleic acid encoding a protein that inhibits the expression of neural proteins in non-neural tissues.

The nucleic acid embodiments of the invention are preferably deoxyribonucleic acids,
preferably double-stranded deoxyribonucleic acids, except that, for hybridization probes,
single-stranded nucleic acids are preferred. However, nucleic acids of the present invention also include
ribonucleic acids. The nucleic acids of the present invention are also referred to as polynucleotides
or polynucleic acids.

Numerous methods are known to delete a segment of a nucleic acid from or mutate a
nucleic acid that encodes a protein and to confirm the function of the proteins encoded by these
deleted or mutated nucleic acids. Accordingly, the invention also relates to a mutated or deleted
version of a REST protein-encoding nucleic acid that encodes a protein that retains the ability to
bind specifically to the RE1 promoter element and/or the ability to suppress an RE1-responsive
promoter when appropriately bound to the vicinity of the promoter.

The invention also relates to a nucleic acid encoding, in the proper order, at least 4 of the
zinc finger domains of a REST protein, preferably at least 6 of the zinc finger domains, more 
preferably all of the zinc finger domains. The zinc finger domains for human REST are identified 
in Figure 1. Preferably, the nucleic acid is SEQ ID NO:2.

Transcription suppressive proteins, such as Krüppel, Kid-1, and ZNF2, generally have
distinct suppressor domains which function so long as they are appropriately linked to DNA binding
domains that suitably bring the suppressor domains into the vicinity of the target promoters. See,
for instance, Licht et al., Nature 346, 76-79, 1990; Witzgall et al., Proc. Natl. Acad. Sci. USA 91,
4514-4518, 1994. Such a suppressor domain can readily be identified for the REST protein using
deletional approaches and recombinant fusion protein approaches that are well known in the art.
Accordingly, the invention also is directed to a nucleic acid encoding a segment of a
REST protein that is effective to repress the use of a promoter when attached to a protein that binds
the promoter. Preferably, the encoded protein will be effective to repress the use of the promoter
for the CNS-type voltage-dependent sodium channel gene. Studies with the aforementioned RE1
nucleic acid suggest that it is ineffective as a transcription silencing element when inserted into
some gene promoters. Accordingly, the promoters discussed in reference to this embodiment are
RE1-responsive promoters.

It is recognized that many deletional or mutational analogs of nucleic acid sequences for a
REST protein are effective hybridization probes for REST nucleic acid. Accordingly, the invention
relates to nucleic acid sequences that hybridize with such REST-encoding sequences under stringent
conditions. Preferably, the nucleic acid of the present invention hybridizes with SEQ ID NO:1
under stringent conditions. The invention also relates to nucleic acids that hybridize with SEQ ID
NO:2 under such stringent conditions.

"Stringent conditions" refers to conditions that allow for the hybridization of substantially 
related nucleic acids, where relatedness is a function of the sequence of nucleotides in the respective
nucleic acids. For instance, for a nucleic acid of 100 nucleotides, such conditions will generally
allow hybridization thereto of a second nucleic acid having at least about 85% homology,
preferably having at least about 90% homology. Such hybridization conditions are described by
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press,
1989.

The invention further relates to REST proteins and to proteins having sufficient zinc finger
domains to confer the ability to bind the RE1 promoter element. Preferably, the protein has at least
4 of the zinc finger domains of REST, more preferably at least 6, yet more preferably at least 7. Still
more preferably, the RE1 binding protein has all of the zinc finger domains. Preferably, the
protein has the sequence of a contiguous stretch of at least about 252 amino acids of SEQ ID NO:1,
more preferably, of a contiguous stretch of at least about 504 amino acids.

As discussed above, deletional or mutational methods of producing recombinant proteins
that retain a given activity are well known. Thus, the embodiments of the present invention that
relate to proteins also encompass analogs of REST proteins that retain one or both of the ability to
bind the RE1 promoter element and to suppress the activity of a promoter to which the protein is
bound. These analogs preferably lack no more than about 360 amino acid residues of deleted
sequence at the C-terminal or N-terminal ends, more preferably no more than about 180 amino acid
residues of deleted sequence. The remaining sequence of the REST protein will preferably have no
more than about 20 point mutations, preferably no more than about 10 point mutations, more
preferably no more than about 5 point mutations. The point mutations are preferably conservative
point mutations. Preferably, the analogs will have at least about 85% homology, preferably at least
about 90% homology, more preferably at least about 95% homology to a portion of an animal
REST protein retaining one or both of REST's known activities, such as the proteins of SEQ ID
NO:1 or SEQ ID NO:2.

Antigens for eliciting the production of antibodies against the REST protein can be
produced recombinantly by expressing all of or a part of the nucleic acid of a REST protein in a
bacteria or a yeast or other eukaryotic cell line. In one embodiment, the recombinant protein is
expressed as a fusion protein, with the non-REST portion of the protein serving either to facilitate
purification or to enhance the immunogenicity of the fusion protein. For instance, the non-REST
portion comprises a protein for which there is a readily-available binding partner that is utilized for
affinity purification of the fusion protein. The antigen includes an "antigenic determinant," i.e., a
minimum segment of amino acids sufficient to bind specifically with an anti-REST antibody.

Rules for designing PCR primers are well known in the art, as reviewed by PCR Protocols,
Cold Spring Harbor Press, 1991. Degenerate primers, i.e., preparations of primers that are
heterogeneous at given sequence locations, are designed to amplify nucleic acid sequences that are
highly related to, but not identical to, a REST protein. For instance, such degenerate primers, in
one embodiment, are designed from the human REST cDNA and used to amplify nucleic acid
sequences for REST proteins from non-human species, as illustrated in the examples.
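
For illustration, the notion of a primer preparation that is "heterogeneous at given sequence locations" can be sketched in Python using the standard IUPAC nucleotide ambiguity codes. The function names and the example primer are hypothetical; the six-base example is simply the degenerate back-translation of the dipeptide Cys-Glu (TGY GAR) and is not one of the primers used in the examples.

```python
# Illustrative sketch: expanding a degenerate primer into its member sequences.
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences in the primer preparation."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: TGY (Cys) followed by GAR (Glu).
print(degeneracy("TGYGAR"))
print(expand("TGYGAR"))
```

Keeping the degeneracy low matters in practice, since each added ambiguous position dilutes the fraction of the preparation that matches any one template exactly.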

The method by which human REST cDNA was isolated, which is described in detail in the
examples, illustrates how readily RE1-binding domains from REST proteins are identified. In the
isolation method, a library was made of cDNA from a REST-expressing cell and inserted into a
yeast expression vector for the GAL4 activation domain so that the library would express fusion
proteins having one part derived from cDNA and another part that is the GAL4 activation domain.
Initial partial cDNA clones were identified by their ability to bind an RE1 element on the promoters
for two reporter genes and activate expression of those genes by causing the fused GAL4 activation
domain to act on the promoters. These initial clones were of portions of the RE1 binding domain
of the human REST protein. The same methodology can be used to identify other sequences from
other animal sources that are sufficient to bind the RE1 element.

Additionally, the mutational and deletional methodologies that are well known in the art are
applied to nucleic acids having the sequence of SEQ ID NO:2, which encodes the zinc finger
domain of human REST. Nucleic acid constructs that express such mutated or deleted zinc finger
domains are tested for the RE1 binding activity of the expressed protein. One facile method of
doing this is to sub-clone the constructs into the GAL4 vector discussed above. Successful
constructs activate the two RE1-containing reporter genes that were used in the initial cloning of
human REST cDNA.

For identifying the suppressor domain of REST, one approach is to take a REST cDNA
and create deletional mutants lacking segments at either the 5' or the 3' end by, for instance, partial
digestion with S1 nuclease, Bal 31 or Mung Bean nuclease (the latter approach described in
literature available from Stratagene, San Diego, CA, in connection with a commercial deletion
cloning kit). Alternatively, the deletion mutants are constructed by subcloning restriction fragments
of a REST cDNA. The deletional constructs are cloned into expression vectors and tested for their
ability to suppress the expression of a promoter that has a functional RE1 element. For instance, a
reporter construct having the promoter for the CNS-type voltage-dependent sodium channel linked
to the gene for chloramphenicol acetyl transferase ("CAT") is used. Such a vector is described
below in the examples. Functional constructs diminish the level of expression of CAT, an enzyme
that is readily measurable by well established techniques. See, for example, Gorman et al., Mol.
Cell. Biol. 2, 1044-1051, 1982 and Young et al., DNA 4, 469-475, 1985.

Mutational and deletional approaches are applied to all of the nucleic acid sequences of the
invention that express REST-related proteins. As discussed above, conservative mutations are
preferred. Such conservative mutations include mutations that switch one amino acid for another
within one of the following groups:

1. Small aliphatic, nonpolar or slightly polar residues: Ala, Ser, Thr, Pro and Gly;

2. Polar, negatively charged residues and their amides: Asp, Asn, Glu and Gln;

3. Polar, positively charged residues: His, Arg and Lys;

4. Large aliphatic, nonpolar residues: Met, Leu, Ile, Val and Cys; and

5. Aromatic residues: Phe, Tyr and Trp.
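
The five groups above lend themselves to a simple lookup. The following Python sketch, offered only as an illustration, tests whether a given point mutation stays within one group; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and residues are written with their standard three-letter codes.

```python
# Illustrative sketch: is a substitution conservative under the five groups?
CONSERVATIVE_GROUPS = [
    {"Ala", "Ser", "Thr", "Pro", "Gly"},   # 1. small aliphatic, nonpolar or slightly polar
    {"Asp", "Asn", "Glu", "Gln"},          # 2. negatively charged residues and their amides
    {"His", "Arg", "Lys"},                 # 3. positively charged
    {"Met", "Leu", "Ile", "Val", "Cys"},   # 4. large aliphatic, nonpolar
    {"Phe", "Tyr", "Trp"},                 # 5. aromatic
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues differ and fall within the same group."""
    if original == replacement:
        return False  # no mutation at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("Leu", "Ile"))  # same group (4)
print(is_conservative("Asp", "Lys"))  # groups 2 and 3
```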
A preferred listing of conservative substitutions is the following:

Original Residue   Substitution
Ala                Gly, Ser
Arg                Lys
Asn                Gln, His
Asp                Glu
Cys                Ser
Gln                Asn
Glu                Asp
Gly                Ala, Pro
His                Asn, Gln
Ile                Leu, Val
Leu                Ile, Val
Lys                Arg, Gln, Glu
Met                Leu, Tyr, Ile
Phe                Met, Leu, Tyr
Ser                Thr
Thr                Ser
Trp                Tyr
Tyr                Trp, Phe
Val                Ile, Leu


The types of substitutions selected may be based on the analysis of the frequencies of amino
acid substitutions between homologous proteins of different species developed by Schulz et al.,
Principles of Protein Structure, Springer-Verlag, 1978, pp. 14-16, on the analyses of
structure-forming potentials developed by Chou and Fasman, Biochemistry 13, 211, 1974 or other such
methods reviewed by Schulz et al., Principles of Protein Structure, Springer-Verlag, 1978, pp.
108-130, and on the analysis of hydrophobicity patterns in proteins developed by Kyte and
Doolittle, J. Mol. Biol. 157: 105-132, 1982.
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Numerous methods for detennining percent homology are known in the art. One 
preferred method is to use version 6.0 of the GAP computer program for making sequence 
conq)arisons. The program is available from the University of Wisconsin Genetics Computer 
Group and utilizes the alignment method of Needleman and Wunsch, 7. Moi BioL 48, 443, 
5 1970, as revised by Smith and Waterman Adv. AppL Math. 2. 482, 1981. 
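
To make the idea of an alignment-based percent homology concrete, the following is a minimal sketch of a Needleman-Wunsch global alignment with a simple percent-identity calculation. It is not the GAP program: the scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for illustration, and GAP uses its own scoring matrices and gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = needleman_wunsch(a, b)
    if not x:
        return 0.0
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)
```

Dividing by the alignment length (gaps included) is one of several conventions; other programs divide by the shorter sequence length, so values are comparable only within one convention.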

Nucleic acid molecules that bind to a REST-encoding nucleic acid under high 
stringency conditions are identified functionally, using methods outlined above, or by using the 
hybridization rules reviewed in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd
ed., Cold Spring Harbor Press, 1989.

Antisera to REST are made by creating a REST antigen by linking a portion of the
cDNA for human REST to a cDNA for glutathione S-transferase ("GST") found on a
commercial vector. The resulting vector expresses a fusion protein containing an antigenic
portion of REST and GST that is readily purified from the expressing bacteria using a
glutathione affinity column. The purified antigenic fusion protein is used to immunize rabbits.
The same approach is used to make antigens based on other portions of the REST protein.
Procedures for making antibodies and for identifying antigenic portions of proteins are well
known. See, for instance, Harlow, Antibodies, Cold Spring Harbor Press, 1989.

The proteins of the invention are made, in one embodiment, using the identical
approach as for generating REST antisera. The cDNA specific for a given REST protein or
analog thereof is linked using standard means to a cDNA for GST, found on a commercial
vector, for example. The fusion protein expressed by such a vector construct includes the
REST protein or analog and GST, and can be treated as above for purification. Should the
GST segment of the fusion protein interfere with function, it is removed by partial proteolytic
digestion approaches that preferentially attack unstructured regions, such as the linkers between
GST and the REST-derived protein. The linkers are designed to lack structure, for instance
using the rules for secondary structure-forming potential developed by Chou and Fasman,
Biochemistry 13, 211, 1974. The linker is also designed to incorporate protease target amino
acids, such as, for trypsin, arginine and lysine residues. To create the linkers, standard
synthetic approaches for making oligonucleotides are employed together with standard
subcloning methodologies. Fusion partners other than GST can be used.
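
The linker design described above can be checked with a short sketch that locates trypsin target residues (arginine and lysine) in a candidate linker and lists the fragments expected from complete cleavage after each of them. The linker sequence and function names are hypothetical; the sketch ignores refinements such as reduced cleavage before proline and says nothing about secondary structure.

```python
def trypsin_sites(seq):
    """Return 1-based positions of Lys (K) and Arg (R) in a one-letter sequence."""
    return [i + 1 for i, aa in enumerate(seq.upper()) if aa in "KR"]


def digest(seq):
    """Split a sequence after every K or R (idealized complete digestion)."""
    fragments, start = [], 0
    for pos in trypsin_sites(seq):
        fragments.append(seq[start:pos])
        start = pos
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


# Hypothetical flexible linker carrying one Lys and one Arg target.
LINKER = "GSGSGKGSGSGRGSGS"
```

With this hypothetical linker, `trypsin_sites(LINKER)` gives positions 6 and 12.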

Also, of course, the REST proteins can be directly synthesized from nucleic acid (by
the cellular machinery) without use of fusion partners. For instance, nucleic acids having the
sequence of SEQ ID NO: 10 are subcloned into an appropriate expression vector having an
appropriate promoter and expressed in an appropriate organism. (Note that REST lacks
consensus glycosylation sites and, especially since it is not a membrane or exported protein,
should lack glycosylations.) Antibodies against REST are employed to facilitate purification.
Additional purification techniques are applied as needed, including, without limitation,
preparative electrophoresis, FPLC (Pharmacia, Uppsala, Sweden), HPLC (e.g., using gel
filtration, reverse-phase or mildly hydrophobic columns), gel filtration, differential precipitation
(for instance, "salting out" precipitations), ion-exchange chromatography and affinity
chromatography (including affinity chromatography using the RE1 duplex nucleotide sequence
as the affinity ligand).
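
A protein sequence can be scanned for consensus glycosylation sites with a short sketch such as the following. It assumes that "consensus glycosylation sites" refers to the N-linked sequon Asn-X-Ser/Thr with X not proline, which the specification does not state explicitly; the function name and test peptide are hypothetical, and the sketch makes no claim about the REST sequence itself.

```python
import re

# N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in _SEQUON.finditer(protein.upper())]
```

An empty result is consistent with, but does not prove, the absence of glycosylation.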

A protein or nucleic acid is "isolated" in accordance with the invention in that the
molecular cloning of the nucleic acid of interest, for example, involves taking a human REST
nucleic acid from a human cell, and isolating it from other human-derived nucleic acids. This
isolated nucleic acid may then be inserted into a host cell, which may be yeast or bacteria, for
example, or another human cell. A protein or nucleic acid is "substantially pure" in accordance
with the invention if it is predominantly free of other proteins or nucleic acids, respectively. A
macromolecule, such as a nucleic acid or a protein, is predominantly free if it constitutes at
least about 50% by weight of the given macromolecule in a composition. Preferably, the
protein or nucleic acid of the present invention constitutes at least about 60% by weight of the
total proteins or nucleic acids, respectively, that are present in a given composition thereof,
more preferably about 80%, still more preferably about 90%, yet more preferably about 95%,
and most preferably about 100%. Such compositions are referred to herein as being proteins or
nucleic acids that are 60% pure, 80% pure, 90% pure, 95% pure, or 100% pure, any of which
are substantially pure.

One aspect of the present invention is directed to the use of "antisense" polynucleic
acid to treat neural diseases, including de-differentiated neural tumor cells and diseases
characterized by diminished neural activity. Such an approach is also used to trigger the
differentiation of neural stem cells. The approach involves the use of an antisense molecule
designed to bind nascent mRNA (or "sense" strand) for a REST protein, thereby stopping or
inhibiting the translation of the mRNA, or to bind to the REST gene to interfere with its
transcription. Once the sequence of the mRNA sought to be bound is known, an antisense
molecule is designed that binds the sense strand by the Watson-Crick base-pairing rules,
forming a duplex structure analogous to the DNA double helix. See Gene Regulation: Biology of
Antisense RNA and DNA, Erickson and Izant, eds., Raven Press, New York, 1991.

A serious barrier to fully exploiting this technology is the problem of efficiently
introducing into cells a sufficient number of antisense molecules to effectively interfere with the
translation of the targeted mRNA or the function of DNA. One method that has been
employed to overcome this problem is to covalently modify the 5' or the 3' end of the antisense
polynucleic acid molecule with hydrophobic substituents. These modified nucleic acids
generally gain access to the cell's interior with greater efficiency. See, for example, Boutorin et
al., FEBS Lett. 23, 1382-1390, 1989; Shea et al., Nucleic Acids Res. 18, 3777-3783, 1990.
Additionally, the phosphate backbone of the antisense molecules has been modified to remove
the negative charge (see, for example, Agris et al., Biochemistry 25, 6268, 1986; Cazenave and
Hélène in Antisense Nucleic Acids and Proteins: Fundamentals and Applications, Mol and Van
der Krol, eds., p. 47 et seq., Marcel Dekker, New York, 1991) or the purine or pyrimidine
bases have been modified (see, for example, Antisense Nucleic Acids and Proteins:
Fundamentals and Applications, Mol and Van der Krol, eds., p. 47 et seq., Marcel Dekker,
New York, 1991; Milligan et al. in Gene Therapy For Neoplastic Diseases, Huber and Laso,
eds., p. 228 et seq., New York Academy of Sciences, New York, 1994). Other attempts to
overcome the cell penetration barrier include incorporating the antisense polynucleic acid
sequence into an expression vector that is inserted into the cell in low copy number, but which,
when in the cell, directs the cellular machinery to synthesize more substantial amounts of
antisense polynucleic molecules. See, for example, Farhood et al., Ann. N.Y. Acad. Sci. 716,
23, 1994. This strategy includes the use of recombinant viruses that have an expression site
into which the antisense sequence has been incorporated. See, e.g., Boris-Lawrie and Temin,
Ann. N.Y. Acad. Sci. 716:59 (1994). Others have tried to increase membrane permeability by
neutralizing the negative charges on antisense molecules or other nucleic acid molecules with
polycations. See, e.g., Wu and Wu, Biochemistry 27:887-892, 1988; Behr et al., Proc. Natl.
Acad. Sci. U.S.A. 86:6982-6986, 1989.
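
The Watson-Crick design rule for antisense molecules described above amounts to taking the reverse complement of the targeted stretch of mRNA. A minimal sketch, assuming a DNA antisense oligonucleotide and using a hypothetical function name and target sequence, is:

```python
# Map each mRNA base to its complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna(mrna_segment):
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA segment (5'->3')."""
    seg = mrna_segment.upper().replace("T", "U")  # accept cDNA-style input as well
    invalid = set(seg) - set("ACGU")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return seg.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]
```

For the hypothetical target `AUGGCC`, the sketch returns `GGCCAT`. Choosing which region of a real mRNA to target, and any chemical modification of the oligonucleotide, lie outside this illustration.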

The polynucleotide or nucleic acid compositions of the invention can be administered
orally, topically, rectally, vaginally, by pulmonary route by use of an aerosol, or parenterally,
i.e. intramuscularly, intraventricularly, subcutaneously, intraperitoneally or intravenously. The
polynucleotide compositions are administered alone, or they are combined with a
pharmaceutically-acceptable carrier or excipient according to standard pharmaceutical practice.
For the oral mode of administration, the polynucleotide compositions are used in the form of
tablets, capsules, lozenges, troches, powders, syrups, elixirs, aqueous solutions and
suspensions, and the like. In the case of tablets, carriers that are used include lactose, sodium
citrate and salts of phosphoric acid. Various disintegrants such as starch, and lubricating agents
such as magnesium stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For
oral administration in capsule form, useful diluents are lactose and high molecular weight
polyethylene glycols. When aqueous suspensions are required for oral use, the polynucleotide
compositions are combined with emulsifying and suspending agents. If desired, certain
sweetening and/or flavoring agents can be added. For parenteral administration, sterile
solutions of the conjugate are usually prepared, and the pH of the solutions is suitably
adjusted and buffered. For intravenous use, the total concentration of solutes is controlled to
render the preparation isotonic. For ocular administration, ointments or droppable liquids may
be delivered by ocular delivery systems known to the art, such as applicators or eye droppers.
Such compositions include mucomimetics, such as hyaluronic acid, chondroitin sulfate,
hydroxypropyl methylcellulose or poly(vinyl alcohol), preservatives, such as sorbic acid or
EDTA, and the usual quantities of diluents and/or carriers well known in the art. For
pulmonary administration, diluents and/or carriers are selected so as to allow the formation of
an aerosol.

Generally, the polynucleotide compositions are administered in an effective amount. 
An effective amount is an amount effective to either (1) reduce the symptoms of the disease 
sought to be treated or (2) induce a pharmacological change relevant to treating or preventing
the disease sought to be treated.

For viral gene therapy vectors, dosages are generally from about 1 µg to about 1 mg of
nucleic acid per kg of body mass. For non-infective gene therapy vectors, dosages are
generally from about 1 to about 100 mg of nucleic acid per kg of body mass. Antisense
oligonucleotide dosages are generally from about 1 µg to about 100 mg of nucleic acid per kg
of body mass.
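
The per-kilogram ranges above scale linearly with body mass, as the following arithmetic sketch shows. The dictionary keys and function name are hypothetical, and the lower bound for non-infective vectors is read here as 1 mg per kg, which is an assumption because the text gives no unit for that bound. The sketch is illustrative arithmetic only and not dosing guidance.

```python
# Dose ranges in micrograms of nucleic acid per kg of body mass, as read from the text.
# ASSUMPTION: the "about 1 to about 100 mg" range is taken as 1 mg to 100 mg per kg.
DOSE_RANGES_UG_PER_KG = {
    "viral_vector": (1.0, 1_000.0),
    "non_infective_vector": (1_000.0, 100_000.0),
    "antisense_oligonucleotide": (1.0, 100_000.0),
}


def dose_range_ug(kind, body_mass_kg):
    """Return (low, high) total dose in micrograms for the given body mass."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    low, high = DOSE_RANGES_UG_PER_KG[kind]
    return low * body_mass_kg, high * body_mass_kg
```

For a 70 kg body mass, the viral vector range works out to 70 µg to 70 mg of nucleic acid.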

The invention also encompasses the use of gene therapy approaches to insert a gene
expressing an RE1 binding domain but not a suppressor domain into de-differentiated tumor
cells or neural cells with diminished neural activity. Gene therapy approaches for inserting a
gene for a protein with REST activity into overactive neural cells are also within the invention.
Also within the invention are gene therapy approaches for inserting a gene for a REST
suppressor domain linked to a promoter binding element to suppress the activity of the
promoter bound by the binding element.

For gene therapy, medical workers prefer to incorporate, into one or more cell types of
an organism, a DNA vector capable of directing the synthesis of a protein missing from the cell
or useful to the cell or organism when expressed in greater amounts. The methods for
introducing DNA to cause a cell to produce a new protein or a greater amount of a protein are
called "transfection" methods. See, generally, Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Press, 1989.

A number of the above-discussed methods of enhancing cell penetration by antisense
nucleic acid are generally applicable methods of incorporating a variety of nucleic acids into
cells. Other general methods include calcium phosphate precipitation of nucleic acid and
incubation with the target cells (Graham and Van der Eb, Virology 52:456, 1983),
co-incubation of nucleic acid, DEAE-dextran and cells (Sompayrac and Danna, Proc. Natl. Acad.
Sci. 12:7575, 1981), electroporation of cells in the presence of nucleic acid (Potter et al.,
Proc. Natl. Acad. Sci. 81:7161-7165, 1984), incorporating nucleic acid into virus coats to
create transfection vehicles (Gitman et al., Proc. Natl. Acad. Sci. U.S.A. 82:7309-7313, 1985)
and incubating cells with nucleic acid incorporated into liposomes (Wang and Huang, Proc.
Natl. Acad. Sci. 84:7851-7855, 1987). One approach in employing gene therapy is to
incorporate the gene sought to be introduced into the cell into a virus, such as an adenovirus.
See, for instance, Akli et al., Nature Genetics 3, 224, 1993.

The stem cells that are useful in neural stem cell replacement therapy include human
mesencephalic fetal brain cells, porcine fetal brain cells, human subventricular zone cells and
glial progenitor cells, including O2A cells (which are progenitors for all glial cell types,
including astrocytes and oligodendrocytes).

The invention also relates to methods of measuring a REST protein or mRNA from a
tissue or staining a tissue for a REST protein or mRNA. Useful methods of measuring mRNA
include Southern blot analysis, dot blot analysis, nuclear transcription analysis, histochemical
staining for mRNA and polymerase chain reaction amplification methods. See generally,
Ausubel et al., Current Protocols in Molecular Biology, Wiley Press, 1993; PCR Protocols,
Cold Spring Harbor Press, 1991; and Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Press, 1989. For in situ nucleic acid hybridization
techniques, see Saldino et al., Methods in Enzymology 168, 761-777, 1989; Meson et al.,
Methods in Enzymology 168, 753-761, 1989; Harper et al., Methods in Enzymology 151,
539-551, 1987; Angerer et al., Methods in Enzymology 152, 649-661, 1987; Wilcox et al., Methods
in Enzymology 124, 510-533, 1986. Methods of measuring protein in a tissue include
enzyme-linked immunoassays ("ELISA"), immuno-diffusion assays, radio-immunoassays,
immunoelectrophoresis, Western blot analyses and immunohistochemical staining techniques.
See generally, Ausubel et al., Current Protocols in Molecular Biology, Wiley Press, 1993;
Antibodies, a Laboratory Manual, Cold Spring Harbor Press, 1988; and Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, 1989.

PCR methods of amplifying nucleic acids utilize at least two primers. One of these
primers is capable of hybridizing to a first strand of the nucleic acid to be amplified and of
priming enzyme-driven nucleic acid synthesis in a first direction. The other is capable of
hybridizing to the reciprocal sequence of the first strand (if the nucleic acid to be amplified is
single stranded, this sequence is initially hypothetical, but is synthesized in the first amplification
cycle) and of priming nucleic acid synthesis from that strand in the direction opposite the first
direction and towards the site of hybridization for the first primer. Conditions for conducting
such amplifications, particularly under preferred high stringency conditions, are well known.
See, for example, PCR Protocols, Cold Spring Harbor Press, 1991.
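
The geometry of the two primers described above can be illustrated with a short sketch. Given the plus strand of a region to be amplified (written 5' to 3'), one primer matches the start of that strand and so hybridizes to the reciprocal strand, while the other is the reverse complement of the end of the region and hybridizes to the plus strand itself, so that synthesis from each primer runs toward the other. The function names, the 20-nucleotide default length, and the test template are hypothetical; melting temperature, specificity, and other real design criteria are ignored.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def primer_pair(plus_strand, length=20):
    """Return (forward, reverse) primers flanking the given region.

    forward: identical to the 5' end of the plus strand (anneals to the minus strand).
    reverse: reverse complement of the 3' end of the plus strand (anneals to the plus strand).
    """
    region = plus_strand.upper()
    if len(region) < 2 * length:
        raise ValueError("region is shorter than two primer lengths")
    return region[:length], reverse_complement(region[-length:])
```

For example, `primer_pair("ATGCGTACGTTAGC", length=4)` yields the pair `ATGC` and `GCTA`.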

The samples that are amenable to assaying or staining for REST protein or nucleic acid
include, without limitation, cells or tissues (including nerve tissues), protein extracts, nucleic 
acid extracts and biological fluids such as cerebral fluid, serum and plasma. Preferred samples 
are nervous system-derived samples. 

In screening assays for antagonists of the activity of REST, the agents to be screened
include a great variety of chemicals including, but not limited to, biologically active molecules
such as peptides, carbohydrates, alkaloids, aromatic compounds, polynucleotides and analogs
thereof (particularly analogs that have been rendered more membrane permeable), DNA
intercalating compounds and other pharmaceutical agents. One cell-free assay comprises the
steps of:

providing a nuclear extract,
providing a REST protein,
providing the nucleotide triphosphates necessary for transcription,
providing a promoter sequence that includes an element effective to bind to REST and
thereby be inhibited,
providing a candidate compound or a cocktail of candidate compounds,
mixing the extract, protein, promoter, nucleotide triphosphates, and candidate
compound(s),
incubating the mixture to allow transcription to proceed, and
determining the level of the resulting transcription from the promoter, relative increases
in transcription reflecting an inhibition of either the binding of REST to the promoter
element or the activity of the suppressor domain of REST.

For nuclear extracts from REST-expressing cells, the extract itself will generally provide
sufficient amounts of the REST protein. Sufficient amounts of the nucleotide triphosphates may
also be found in the nuclear extract; however, generally, additional nucleotide triphosphates are
added to reduce the variability of the assay. The level of transcription is determined by primer
extension as described by Bodner and Karin, Cell 50, 267-275, 1987.

One embodiment of the cellular assay comprises the steps of:

providing a eukaryotic cell line that expresses the REST protein (either natively or
through a stable or transient transfection),
providing a suitable medium for maintaining the cell line,
adding to the medium a candidate compound or a cocktail of candidate compounds,
incubating the cells to allow transcription to proceed, and
determining the level of transcription from a REST-responsive promoter.

One way of determining the level of transcription is to have provided the cells with a
REST-responsive promoter coupled to a gene for a readily measurable gene product. This method is,
of course, indirect, since it requires the transcript, which one would prefer to directly measure,
to be translated into a protein that is then measured. Nonetheless, the method is widely
recognized as a surrogate measure of transcription. The appropriate RNA transcript is also
measured by methods well known in the art, such as dot-blot hybridization or Northern blot
analysis.
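
In either assay, a candidate antagonist is recognized by a relative increase in transcription from the REST-responsive promoter. One way such readings might be tabulated is sketched below; the function names, the compound labels, the reporter units, and the two-fold threshold are all hypothetical choices and are not taken from the specification.

```python
def fold_increase(treated_signal, untreated_signal):
    """Ratio of reporter signal with a candidate compound to signal without it."""
    if untreated_signal <= 0:
        raise ValueError("untreated signal must be positive")
    return treated_signal / untreated_signal


def flag_candidates(readings, untreated_signal, threshold=2.0):
    """Return names of compounds whose signal rises at least `threshold`-fold."""
    return sorted(
        name
        for name, signal in readings.items()
        if fold_increase(signal, untreated_signal) >= threshold
    )
```

A real screen would add replicates and controls for compounds that raise transcription generally rather than through REST.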

The REST protein has a negative influence on the activity of many promoters having an
RE1 or an RE1-like sequence (such as that of the promoter for SCG10). Direct cloning
strategies for such negative factors are difficult since they require time consuming
measurements of the loss of a property. To create a positive signal that can more facilely be
used to screen a cDNA library for REST-related cDNAs, a HeLa cell cDNA library was
created to express fusion proteins between cDNA-encoded polypeptides and the activation
domain of the yeast GAL4 regulatory protein. The library was designed to identify a clone
encoding a fusion protein having an RE1-binding domain and a GAL4 activation domain. Such
a fusion protein acts as a positive transcription factor on an appropriate RE1-containing promoter.
A HeLa cell library was selected because HeLa cells do not express the type II voltage
dependent sodium channel and express an RE1-binding activity.

The invention is described in more detail, but without limitation, by reference to the
examples set forth below.

Example 1 - "One-Hybrid" Cloning of Three Partial Sequences
a. Yeast Strains

The cloning strategy employed yeast containing two reporter genes having RE1
regulatory sequences in or adjacent to their promoters. One reporter gene was HIS3, which
confers to yeast the ability to grow in media that lacks the amino acid histidine, functionally
attached to the yeast GAL1 promoter. The GAL1 promoter is normally inactive in the absence
of a yeast activator protein such as GAL4. The other reporter gene was the bacterial lac z gene
functionally coupled to the yeast CYC1 promoter. The CYC1 promoter is normally inactive in
the absence of a yeast activator protein such as GAL4.

i. The HIS3 Construct
Four copies of the 28 bp RE1 nucleic acid, SEQ ID NO:29, which had been
synthesized by standard oligonucleotide synthesis methods, were cloned into a unique EcoRI
site on yeast expression shuttle vector pTHl (described by Flick and Johnson, Mol. Cell. Biol.
10(9), 4757-4769, 1990). The EcoRI site is adjacent (and 5') to a yeast GAL1 promoter that is
functionally linked to a HIS3 gene. The shuttle vector also contained a marker gene that
directed the expression of a gene that confers to yeast the ability to grow in the absence of the
pyrimidine base uracil. A derivative plasmid containing four properly oriented copies of the
RE1 sequence, as confirmed by sequence analysis, was isolated and designated pJAC12.

ii. The lac z Construct

Four copies of the 28 bp RE1 nucleic acid, SEQ ID NO:29, were cloned between the
Pst and BamHI sites upstream of the CYC1 promoter found on expression vector pCZi3gal
(described by Lue and Kornberg, Proc. Natl. Acad. Sci. USA 84, 8839-8843, 1993), which
promoter is functionally linked to a bacterial lac z gene. The vector also contained a marker
gene that directed the expression of a gene that confers to yeast the ability to grow in the
absence of the amino acid tryptophan. A derivative plasmid containing four properly oriented
copies of the RE1 nucleic acid, as confirmed by sequence analysis, was isolated and designated
pJAC13.

iii. Yeast Transformation To Incorporate Reporter Genes

The reporter plasmids were linearized and introduced sequentially into a standard yeast
strain (strain W303) by the LiAc method (Schiestl and Gietz, Curr. Genet. 16, 339-346, 1989).
Transformants were selected by growth on plates lacking uracil (indicating the integration of
pJAC12) and tryptophan (indicating the integration of pJAC13). Small scale preparations of
total yeast genomic DNA were prepared from four colonies according to the method of
Sherman et al., Methods in Yeast Genetics, Cold Spring Harbor Press, 1986, to confirm
integration of the pJAC12 and pJAC13 reporter vectors into the yeast genome by Southern blot
analysis using the RE1, CYC1 promoter, HIS3 gene, and TRP1 gene as probes. One of these
four transformants was then utilized for the subsequent cDNA library transformation. This
reporter strain was assessed for growth on his- plates and screened for β-galactosidase activity
and, as expected, was negative for both markers.

iv. Control Reporter Strain

By the same methods described above, a control strain derived from W303 was created
that incorporated analogs of pJAC12 and pJAC13, wherein the RE1 nucleic acids were
substituted with four copies of the inactive mutant RE1 nucleic acid, SEQ ID NO:30, described
by Kraner et al., Neuron 9, 37-44, 1992.

b. cDNA Cloning

A HeLa cell cDNA library was constructed using the pGADGH plasmid containing the
GAL4 activation domain (see Li and Herskowitz, Science 262: 1870-1874, 1993) functionally
linked to a GAL4 promoter and having a polylinker site (including EcoRI and XhoI sites),
located downstream of the activator domain sequence, for inserting the cDNA. The library
plasmid contains a marker for the ability to grow in the absence of the amino acid leucine. The
library was linearized and introduced into the yeast reporter strain by the LiAc method. The
cells were plated on leucine minus and histidine minus agar plates to select colonies that are
putatively transformed with a cDNA to express a fusion protein having an RE1 binding
domain (derived from cDNA) and a GAL4 activation domain.

One hundred his+ colonies were impressed onto filter paper and permeabilized by
freeze-thawing. The filter paper was layered onto another filter paper containing the
β-galactosidase substrate 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-gal, available from Sigma
Chemical Co., St. Louis). The filter paper was incubated at room temperature and monitored
for blue spots, which indicate β-galactosidase positive colonies. Four colonies that were
positive for the lac z marker were isolated. Plasmids containing the cDNA from these four
colonies were isolated as described by Bartel et al., in Cellular Interactions in Development: A
Practical Approach, D.A. Hartley, ed., New York: Oxford University Press, 1994, pp 53-179,
and amplified in bacteria. The plasmids were introduced into the control yeast strain (wherein
the reporter gene promoters contained mutant RE1 sequences). Three of the four plasmids
failed to transform the control strain, indicating that the fusion proteins they encoded interacted
specifically with the RE1 nucleic acid. These plasmids were designated p73, p90 and p613.
The three insert cDNAs were sequenced by the chain termination method (Sanger et al., Proc.
Natl. Acad. Sci. USA 74, 998-1002, 1977) and found to include the sequences of SEQ ID
NO:3, SEQ ID NO:4 and SEQ ID NO:5, all of which encode overlapping portions of an
apparent zinc-finger DNA-binding domain (nucleotides 216-1622, 636-1725 and 695-1622 of
Fig. 1, respectively).



Example 2 - Cloning of Two Overlapping Sequences Encoding REST

SEQ ID NO:3, SEQ ID NO:4 and SEQ ID NO:5 were used to probe another HeLa cell
cDNA library that was cloned into the Lambda Zap II phage (Stratagene, Inc., San Diego,
CA). Two phage isolates containing overlapping cDNAs of 3082 and 4408 bp were isolated
(phages NH2 and NH7, respectively). These cDNAs are designated SEQ ID NO:6 and SEQ
ID NO:7 and correspond to nucleotides -175 to 1616 and 1472 to 5324 of Fig. 1, respectively. From the
overlap of these two cDNAs, most of the full length REST cDNA can be deduced. The 5'
segment, up to position -325, was determined by applying the 5' RACE PCR technique to
HeLa cell cDNA. This segment is designated SEQ ID NO:1. The deduced amino acid
sequence of REST is shown in Figure 1. Note that Lambda Zap II is readily convertible to the
Bluescript plasmid using EcoRI as outlined by the supplier.
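
The overlap between the two phage inserts follows directly from the Fig. 1 coordinates quoted above. A minimal sketch of the interval arithmetic is given below; the function names are hypothetical, and coordinates are treated as inclusive integers on a single continuous scale, which is an assumption about the figure's numbering.

```python
def overlap(a, b):
    """Return the inclusive (start, end) shared by two intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def combined_span(a, b):
    """Return the inclusive (start, end) covered by two overlapping intervals."""
    if overlap(a, b) is None:
        raise ValueError("intervals do not overlap")
    return min(a[0], b[0]), max(a[1], b[1])


# Fig. 1 coordinates as stated in the text for the NH2 and NH7 inserts.
NH2 = (-175, 1616)
NH7 = (1472, 5324)
```

Under these assumptions the inserts share positions 1472 to 1616 and together cover positions -175 to 5324.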

Example 3 - Expression of REST Antigen and Polyclonal Antibody Production

For Example 3, a 1.5 kilobase EcoRI-XhoI fragment of p73 comprising all of SEQ ID
NO:3 was cloned in phase with the cDNA for glutathione S-transferase ("GST") in the
commercial vector pGEX4T3 (Pharmacia, Uppsala, Sweden). The GST-REST fusion protein
was produced in E. coli strain XL-1 blue (Stratagene, San Diego, CA) and purified on a
glutathione-Sepharose column (Pharmacia, Uppsala, Sweden). The purified fusion protein was
used to immunize two rabbits (Pocono Rabbit Farms, PA) to produce a polyclonal antibody
preparation against REST.

Example 4 - RNA Hybridization (Northern Blots)

Total cellular RNA from HeLa cells, PC12 cells, L6 skeletal muscle cells and dorsal
root ganglion was isolated as described by Toledo-Aral et al. (Neuron, in press) and
poly-A+ selected using a commercially available kit (Pharmacia, Inc., Uppsala, Sweden).
Messenger RNA (2-4 µg) was fractionated on denaturing gels and then electrophoretically
transferred onto nylon paper for hybridization. A DNA probe of human REST was generated
by random primer labeling of the EcoRI-XhoI fragment of p73, which includes the nucleic
acid of SEQ ID NO:3, to incorporate ³²P. A rat REST cDNA (600 bp) was obtained by PCR
(with an initial reverse-transcriptase step) of rat skeletal muscle mRNA using a degenerate
primer modelled on the sequence of amino acids 146 to 153 (nucleotides 481 to 504) of the plus
strand of SEQ ID NO:1 and a degenerate primer modelled on the amino-acid-encoding
sequence of amino acid residues 363 to 370 (nucleotides 1087 to 1110) of the minus strand of
SEQ ID NO:1. The PCR-amplified cDNA was cloned into pGEM-7Z (Promega, Madison,
WI), and workable amounts of the plasmid were grown in bacteria. A rat REST riboprobe was
manufactured by linearizing the plasmid with AccI and transcribing it with T7 polymerase in
the presence of ³²P-UTP (Dupont, Wilmington, DE). A riboprobe for the CNS-type sodium
channel was made as described by D'Arcangelo et al., J. Cell Biol. 10(9), 4757-4769, 1993.
Hybridization and washing conditions used with the rat REST and sodium channel riboprobes
were as described by Toledo-Aral et al., Neuron, in press; for the human REST DNA probe,
the hybridization and washing solutions were the same as those used for the riboprobes, except
that the blots were hybridized at 37°C and washed at 32°C.

Northern blot analysis for mRNA for the CNS-type sodium channel and REST in a
number of cell types and tissues produced the following results:

Cell or Tissue Type            CNS-type Sodium Channel mRNA    REST mRNA
HeLa cells                     none                            high levels
rat L6 skeletal muscle cells   none                            high levels
rat PC12 cells                 high level                      extremely low levels
mouse dorsal root ganglia      extremely low levels            high levels

Example 5 - Western Blot Analysis

Western immunoblots of proteins derived from nuclear extracts were performed
according to standard procedures, as described by Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Lab., Cold Spring Harbor, NY, 1989. Nuclear
extracts were prepared by the single lysis method (Sambrook et al., 1989). Extracts were
combined with an equal volume of 2X Laemmli sample buffer (Laemmli, Nature 227,
680-685, 1970) and boiled for 15 minutes. Samples were resolved by SDS-PAGE on 7.5% gels,
transferred to nitrocellulose, and the nitrocellulose was blocked with 10% milk in TTBS
(Sambrook et al., 1989). Immunoblotting was performed by the enhanced
chemiluminescence method using a commercial kit (Amersham, Burlington, MA). The
antibody to REST-GST was used at a 1:20 dilution after purification by FPLC on an alkyl
Superose (a highly crosslinked agarose substituted with octyl groups) column (Pharmacia,
Uppsala, Sweden).

Nuclear extracts were made from the PC12 cell line derived from a neural
pheochromocytoma, which expresses the CNS-type voltage-dependent sodium channel and does
not express an RE1 binding activity, and from HeLa cells, which do not express the CNS-type
voltage-dependent sodium channel and do express an RE1 binding activity. Western blots
probed with the polyclonal antibodies to human REST indicated the presence of an
immunoreactive protein of molecular weight 121 kDa in HeLa cell nuclear extracts, but no
immunoreactive protein in PC12 cell nuclear extracts.



Example 6 - In Situ Hybridization

The developmental pattern of expression of REST was analyzed by in situ hybridization
in mouse embryos. A 600 bp fragment of mouse REST cDNA (encompassing most of the zinc
finger domain) was prepared from 8.5 day mouse embryos by the PCR method described in
Example 4 for the preparation of rat REST cDNA. The amplification product was cloned into
a Bluescript vector (Stratagene, San Diego, CA) and partially sequenced using the Sequenase
Kit (US Biochemicals, Cleveland, OH). In situ hybridization of intact embryos used
digoxigenin (DIG-11-UTP, available from Boehringer Mannheim) labeled RNA probes for
mouse Hox-B1 (Frohman et al., Development 110, 589-608, 1990) and Gbx-2 (Frohman et
al., Mouse Genome 91, 323-325, 1993). Hybridization was performed using a published
protocol (see Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring
Harbor Press, 1989). In brief, embryos were fixed overnight in paraformaldehyde, incubated
in hydrogen peroxide to inactivate endogenous phosphatases, lightly proteinase K digested,
refixed, and hybridized at 70°C in 1 ml of 50% formamide, 5 x SSC pH 4.5, 50 µg/ml yeast
RNA, 1% SDS, 50 µg/ml heparin, 0.1% CHAPS, and 5 mM EDTA containing 1 µg of probe.
The embryos were rinsed in a low stringency wash solution (50% formamide, 5 x SSC, pH 4.5, 1% SDS,
0.1% CHAPS; 70°C), treated with RNAse A, rinsed with a high stringency wash solution
(50% formamide, 2 x SSC, pH 4.5, 0.1% CHAPS; 65°C), and incubated with an
alkaline-phosphatase coupled rabbit anti-digoxin antisera (Boehringer Mannheim, Indianapolis,
IN). The enzyme activity of the reporter was detected by a color reaction with 5-bromo-4-
chloro-3-indolyl phosphate (BCIP) and nitroblue tetrazolium (NBT), which resulted in the
deposition of a water-insoluble purple precipitate. Embryos were rinsed, washed into 80%
glycerol, and photographed intact and in slices.

The in situ hybridization results for 9.5 day embryos indicated the presence of abundant
REST mRNA in all tissues except the developing brain and spinal cord. Robust expression of
REST mRNA was found in neural crest-derived dorsal root ganglia, indicating the expression
of REST in some non-CNS neural tissue.

Example 7 - Mobility Shift Assays for Proteins That Bind RE1 Sequences

The presence of RE1 binding activity in various cells and tissues was tested using a gel
mobility shift assay. Nuclear extracts from HeLa, L6, and primary cultures of rat embryonic
skeletal muscle cells were prepared as described by Dignam et al., Nucl. Acids Res., 11,
1475-1489, 1983. The extracts were preincubated 15 minutes at room temperature with either
buffer control, competitor DNA, REST-GST polyclonal antisera, or rabbit preimmune serum, and
then incubated for two hours at room temperature with a 114 bp 32P end-labeled DNA probe
containing nucleotides -1051 to 837 of the 5' flanking sequence for the CNS-type sodium
channel gene, which promoter sequence includes the RE1 sequence. The samples were
resolved by electrophoresis on a 5% non-denaturing polyacrylamide gel, which was then
autoradiographed. Binding was indicated by the presence of a DNA complex
that moved more slowly in the gel than does the free DNA probe.

The results were that HeLa, L6 and rat embryonic skeletal muscle all contained an RE1
binding activity that was competed away with excess unlabelled RE1-containing DNA but not
by DNA containing the inactive RE1 mutant described by Kraner et al., Neuron, 9, 37-44,
1992. The polyclonal antisera to the REST-GST fusion further retarded mobility, while
pre-immune serum had no effect. This result indicates that a REST-like protein is
responsible for the binding indicated by the gel shift assay.


Example 8 - Expression Vector Encoding The Complete Human REST Protein 

The NH2 vector containing the nucleic acid of SEQ ID NO:6 was digested with Hind
III and Hinc II, and the NH7 vector containing the nucleic acid of SEQ ID NO:7 was digested
with Hinc II and Bgl II. The excised inserts were subcloned into a Hind III and Bam HI
digested pCMV I-amp (Invitrogen, Inc., San Diego) vector. The Hinc II digestion cleaved the
overlap region of NH2 and NH7 at nucleotide 1575, allowing for a contiguous insert of
nucleotides -175 through 3656 to be isolated.

Example 9 - Transfection Studies of REST Function 
Transient transfection of PC12 cells with a plasmid containing the chloramphenicol
acetyl transferase (CAT) gene attached to the RE1-containing promoter for the CNS-type
sodium channel results in the expression of CAT (the plasmid designated herein as "type
II-CAT"). This plasmid has been described by Kraner et al., Neuron, 9, 37-44, 1992. A control
CAT vector driven by the strong rous sarcoma virus (RSV) promoter has been described by
Kraner et al., 1992 and Gorman et al., Proc. Natl. Acad. Sci. USA, 79, 6777-6781, 1982. To
test whether this expression could be shut down by the REST protein, cotransfection
experiments using the type II-CAT plasmid and a plasmid containing the REST cDNA coupled
to the cytomegalovirus ("CMV") promoter were undertaken. A fragment of the REST cDNA,
encoding the entire REST protein, with HindIII and BglII termini (including nucleotides -175 to
3656 of SEQ ID NO:1) was subcloned downstream of the CMV promoter in the commercial
mammalian expression vector pCDNA 1-amp (InVitrogen, Inc., San Diego, CA) between the
HindIII and BamHI sites to create the CMV-REST vector. The resulting expression vector was
designated REST-Express. Rat PC12 cells were transfected with 30 µg of REST-Express and
30 µg of either type II-CAT or RSV-CAT by electroporation (Kraner et al., 1992). Forty-eight
hours after transfection the cells were harvested, centrifuged and lysed by freeze-thaw cycles.
The supernatant was analyzed for CAT activity as previously described in Maue et al., Neuron,
4, 223-231, 1990. A cDNA encoding the Zn finger region of REST (including nucleotides 481
to 1236 of SEQ ID NO:1) was cloned independently into the pCDNA 1-amp vector and was
used as an interfering form of REST in transient transfection assays. L6 muscle cells and PC12
cells were transfected with 30 µg of the interfering REST vector along with 30 µg of type
II-CAT plasmid by electroporation and treated as above.

The results were that co-transfection into PC12 cells of REST-Express along with the
type II-CAT resulted in a ten-fold decrease in activity versus the activity seen with type II-CAT
alone. REST-Express had no effect on the expression of CAT by RSV-CAT. The interfering
REST vector, encoding just the DNA binding domain of REST, had no effect on the expression
of type II-CAT in PC12 cells. However, in L6 muscle cells, which contain an endogenous
REST activity, the interfering REST vector derepressed the expression of type II-CAT, which
is otherwise inactive in L6 cells. This latter result is consistent with REST having a suppressor
function that is held in the vicinity of the promoter for the CNS-type sodium channel by the
DNA-binding domain. By competing the complete REST protein from the promoter, the
interfering form of REST, which contains only the DNA-binding domain, de-represses the
promoter.
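The comparison behind the "ten-fold decrease" can be sketched as a simple ratio of reporter activity without and with the effector plasmid. The following Python sketch is illustrative only: the function name and the activity values are hypothetical placeholders chosen to show the arithmetic, and are not data from these experiments.

```python
def fold_repression(reporter_alone: float, reporter_with_effector: float) -> float:
    """Ratio of CAT activity without the effector to activity with it.

    A value near 1 means the effector had no effect; a value of 10 means
    a ten-fold decrease in reporter activity.
    """
    if reporter_with_effector <= 0:
        raise ValueError("activity with effector must be positive")
    return reporter_alone / reporter_with_effector


# Hypothetical CAT activities in arbitrary units (NOT measured values).
activities = {
    "type II-CAT": {"alone": 40.0, "with REST-Express": 4.0},
    "RSV-CAT": {"alone": 80.0, "with REST-Express": 80.0},
}

repression = {
    reporter: fold_repression(values["alone"], values["with REST-Express"])
    for reporter, values in activities.items()
}

for reporter, fold in repression.items():
    print(f"{reporter}: {fold:.1f}-fold repression")
```

With placeholder numbers of this kind, a promoter-specific silencer would show a large ratio for the RE1-containing reporter and a ratio near 1 for the RSV control.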


Example 10 - Localization of the Repressor Function 

A number of restriction fragments were isolated from the full length expression clone
described in Example 8 or from the NH2 clone and subcloned into the CMV-promoted
expression vector also described in Example 8. Two other REST fragments were available
from cDNA library screenings. These were clones NH10 and NH12, which contain nucleotides
121-1581 and 25-1308 of Figure 1, respectively (which sequences are designated SEQ ID
NO:27 and 28). The inserts of these clones were excised with EcoRI and subcloned into the
CMV-promoted vector. In total, the inserts subcloned into the expression vector had the
following sequences from Figure 1:

1. Nucleotides 31-3976

2. Nucleotides 31-2234

3. Nucleotides 31-1940

4. Nucleotides 121-1581

5. Nucleotides 25-1308

6. Nucleotides 31-2491 and 2683-3976

In the last of these clones, the sequence between two BstXI restriction sites is excised. These
subclones are co-transfected into PC12 cells along with the type II-CAT plasmid as described
above to determine the silencing potential of the expressed fragment.
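The six inserts listed above can be treated as sets of coordinate ranges, which makes it easy to see how much sequence each construct retains and which positions each one lacks. The sketch below is a hypothetical bookkeeping aid; it assumes the Figure 1 coordinates are inclusive at both ends, which the text does not state.

```python
# Inserts from the list above, as (start, end) ranges in Figure 1 coordinates.
# Assumption: both ends of each range are inclusive.
CONSTRUCTS = {
    1: [(31, 3976)],
    2: [(31, 2234)],
    3: [(31, 1940)],
    4: [(121, 1581)],
    5: [(25, 1308)],
    6: [(31, 2491), (2683, 3976)],  # internal BstXI fragment excised
}


def insert_length(segments):
    """Total number of nucleotides retained across all segments."""
    return sum(end - start + 1 for start, end in segments)


def contains(segments, position):
    """True if the construct retains the given nucleotide position."""
    return any(start <= position <= end for start, end in segments)


lengths = {number: insert_length(segs) for number, segs in CONSTRUCTS.items()}

for number, length in lengths.items():
    print(f"construct {number}: {length} nt")
```

Comparing which constructs silence the reporter against which positions they retain is the kind of reasoning a deletion series like this supports.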

Example 11 - Designing PCR Amplification Primers

The PCR primers used to amplify sequences encoding amino acid residues 146 through
370 in Example 4 were designed as follows. First, the 146 to 153 sequence was translated into
the following nucleic acid sequence encoding it (SEQ ID NO:8):
TGYAARCCNTGYCARTAYGARGCN,
where Y = T/C, R = A/G and N = A/G/T/C. Next, the sequence of amino acid residues 363
to 370 was translated as above. This translated sequence was used to define the following
opposite strand sequence (SEQ ID NO:9):
NGTYTTRTARTCRCARTGNGGRCA.
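The primer design just described (back-translating a peptide into degenerate codons, then taking the opposite strand for the downstream primer) can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the two peptides are inferred here by reading the primers against the genetic code, and the codon table deliberately omits Leu, Ser and Arg, whose six codons cannot be captured by a single simple degenerate triplet.

```python
# One degenerate codon (IUPAC codes) per amino acid; Leu, Ser and Arg are
# omitted because their six codons do not collapse into one exact triplet.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "M": "ATG",
    "N": "AAY", "P": "CCN", "Q": "CAR", "T": "ACN", "V": "GTN",
    "W": "TGG", "Y": "TAY",
}

# Complements for the IUPAC symbols used above.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "N": "N", "H": "D", "D": "H",
}

# Number of concrete bases each symbol stands for.
SYMBOL_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "H": 3, "D": 3, "N": 4}


def back_translate(peptide: str) -> str:
    """Degenerate coding-strand sequence for a one-letter peptide."""
    try:
        return "".join(DEGENERATE_CODON[residue] for residue in peptide)
    except KeyError as exc:
        raise ValueError(f"no single degenerate codon for residue {exc}") from None


def reverse_complement(sequence: str) -> str:
    """Opposite-strand sequence, written 5' to 3'."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(sequence))


def degeneracy(sequence: str) -> int:
    """Number of distinct oligonucleotides in the degenerate pool."""
    total = 1
    for base in sequence:
        total *= SYMBOL_SIZE[base]
    return total


# Peptides inferred from the two primers (an assumption, see the note above).
forward_primer = back_translate("CKPCQYEA")
reverse_primer = reverse_complement(back_translate("CPHCDYKT"))

print(forward_primer, degeneracy(forward_primer))
print(reverse_primer, degeneracy(reverse_primer))
```

Under these assumptions the sketch reproduces both primer sequences given above, and each corresponds to a pool of 1024 distinct oligonucleotides.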

While this invention has been described with an emphasis upon preferred embodiments,
it will be obvious to those of ordinary skill in the art that variations in the preferred
compositions and methods may be used and that it is intended that the invention may be
practiced otherwise than as specifically described herein. Accordingly, this invention includes
all modifications encompassed within the spirit and scope of the invention as defined by the
claims that follow the Sequence Listing.
SEQUENCE LISTING

(1) GENERAL INFORMATION
  (i) APPLICANT: Mandel, Gail, Chong, Jayhong A.
  (ii) TITLE OF INVENTION: REST Protein and DNA
  (iii) NUMBER OF SEQUENCES: 29
  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Dechert Price & Rhoads
    (B) STREET: P.O. Box 5218
    (C) CITY: Princeton
    (D) STATE: New Jersey
    (E) COUNTRY: USA
    (F) ZIP: 08543-5218
  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Mb storage
    (B) COMPUTER: IBM-compatible
    (C) OPERATING SYSTEM: DOS 5.0
    (D) SOFTWARE: WordPerfect
  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE: March 23, 1995
    (C) CLASSIFICATION:
  (viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Allen Bloom
    (B) REGISTRATION NUMBER: 29,135
    (C) REFERENCE/DOCKET NUMBER: 317743-101 WO
  (ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (609) 520-3214
    (B) TELEFAX: (609) 520-3259

(2) INFORMATION FOR SEQ ID NO:1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5648 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA to mRNA
  (iii) HYPOTHETICAL: no
  (iv) ANTI-SENSE: no
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Human
    (H) CELL LINE: HeLa
  (vii) IMMEDIATE SOURCE:
    (A) LIBRARY: cDNA
  (x) PUBLICATION INFORMATION:
    (A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez, José, Toledo-Aral,
        Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler, Yelena
        M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
    (B) TITLE: REST: A Mammalian Silencer Protein that Restricts
        Sodium Channel Gene Expression to Neurons
    (C) JOURNAL: Cell
    (D) VOLUME: 80
    (E) ISSUE:
    (F) PAGES:
    (G) DATE: March 24, 1995
    (K) RELEVANT RESIDUES IN SEQ ID NO:1: FROM 1 TO 5648
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATCTGGCGCG GCGTAGCCCT GTGTTGGAAT GTGCGGCTGC CGCGAGCTCG 50 

CGGCGCAGCA GCGGAGCGAG CGCCGCCGAG GCCCGGGGCC CCAGACCCTG 100 


GCGGCGGCTG CGGCAGCCGA GACGGCAGGG CGAGGCCCGG AGGCCTGAGC 150 

ACCCTCTGCA GCCCCACTCC TGGGCCTTCT TGGTCCACGA CGGCCCCAGC 200 

ACCCAACTTT ACCACCCTCC CCCACCTCTC CCCCGAAACT CCAGCAACAA 250

AGAAAAGTAG TCGGAGAAGG AGCGGCGACT CAGGGTCGCC CGCCCCTCCT 300 

CACCGAGGAA GGCCGAATAC AGTT 324 


ATG GCC ACC CAG GTA ATG GGG CAG TCT TCT GGA GGA GGA GGG CTG 369
Met Ala Thr Gln Val Met Gly Gln Ser Ser Gly Gly Gly Gly Leu
1 5 10 15

TTT ACC AGC AGT GGC AAC ATT GGA ATG GCC CTG CCT AAC GAC ATG 414
Phe Thr Ser Ser Gly Asn Ile Gly Met Ala Leu Pro Asn Asp Met
20 25 30

TAT GAC TTG CAT GAC CTT TCC AAA GCT GAA CTG GCC GCA CCT CAG 459
Tyr Asp Leu His Asp Leu Ser Lys Ala Glu Leu Ala Ala Pro Gln
35 40 45

CTT ATT ATG CTG GCA AAT GTG GCC TTA ACT GGG GAA GTA AAT GGC 504
Leu Ile Met Leu Ala Asn Val Ala Leu Thr Gly Glu Val Asn Gly
50 55 60

AGC TGC TGT GAT TAC CTG GTC GGT GAA GAA AGA CAG ATG GCA GAA 549
Ser Cys Cys Asp Tyr Leu Val Gly Glu Glu Arg Gln Met Ala Glu
65 70 75

CTG ATG CCG GTT GGG GAT AAC AAC TTT TCA GAT AGT GAA GAA GGA 594
Leu Met Pro Val Gly Asp Asn Asn Phe Ser Asp Ser Glu Glu Gly
80 85 90

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 639
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly
95 100 105

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 684
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu
110 115 120

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 729
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser
125 130 135

TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 774
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys
140 145 150

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 819
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln
155 160 165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 864
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 909
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195



GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 954
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 999
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 1044
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 1089
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 1134
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 1179
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 1224
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 1269
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 1314
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 1359
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 1404
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1449
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1494
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCG AAG AAG TGT AAT CTA CAG 1539
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1584
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1629
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1674
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1719
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1764
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1809
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495



GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1854
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1899
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 1944
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

ACA AAA AAG AAA AAG AAG GTA GAA AGC AAA TCC AAA AAT AAT AGT 1989
Thr Lys Lys Lys Lys Lys Val Glu Ser Lys Ser Lys Asn Asn Ser
545 550 555

CAG GAA GTG CCA AAG GGT GAC AGC AAA GTG GAG GAG AAT AAA AAG 2034
Gln Glu Val Pro Lys Gly Asp Ser Lys Val Glu Glu Asn Lys Lys
560 565 570

CAA AAT ACT TGC ATG AAA AAA AGT ACA AAG AAG AAA ACT CTG AAA 2079
Gln Asn Thr Cys Met Lys Lys Ser Thr Lys Lys Lys Thr Leu Lys
575 580 585

AAT AAA TCA AGT AAG AAA AGC AGT AAG CCT CCT CAG AAG GAA CCT 2124
Asn Lys Ser Ser Lys Lys Ser Ser Lys Pro Pro Gln Lys Glu Pro
590 595 600

GTT GAG AAG GGA TCT GCT CAG ATG GAC CCT CCT CAG ATG GGG CCT 2169
Val Glu Lys Gly Ser Ala Gln Met Asp Pro Pro Gln Met Gly Pro
605 610 615

GCT CCC ACA GAG GCG GTT CAG AAG GGG CCC GTT CAG GTG GAG CTG 2214
Ala Pro Thr Glu Ala Val Gln Lys Gly Pro Val Gln Val Glu Leu
620 625 630

CCA CCT CCC ATG GAG CAT GCT CAG ATG GAG GGT GCC CAG ATA CGG 2259
Pro Pro Pro Met Glu His Ala Gln Met Glu Gly Ala Gln Ile Arg
635 640 645

CCT GCT CCT GAC GAG CCT GTT CAG ATG GAG GTG GTT CAG GAG GGG 2304
Pro Ala Pro Asp Glu Pro Val Gln Met Glu Val Val Gln Glu Gly
650 655 660

CCT GCT CAG AAG GAG CTG CTG CCT CCC GTG GAG CCT GCT CAG ATG 2349
Pro Ala Gln Lys Glu Leu Leu Pro Pro Val Glu Pro Ala Gln Met
665 670 675

GTG GGT GCC CAA ATT GTA CTT GCT CAC ATG GAG CTG CCT CCT CCC 2394
Val Gly Ala Gln Ile Val Leu Ala His Met Glu Leu Pro Pro Pro
680 685 690

ATG GAG ACT GCT CAG ACG GAG GTT GCC CAA ATG GGG CCT GCT CCC 2439
Met Glu Thr Ala Gln Thr Glu Val Ala Gln Met Gly Pro Ala Pro
695 700 705

ATG GAA CCT GCT CAG ATG GAG GTT GCC CAG GTA GAA TCT GCT CCC 2484
Met Glu Pro Ala Gln Met Glu Val Ala Gln Val Glu Ser Ala Pro
710 715 720

ATG CAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG CTG TCT CCT 2529
Met Gln Val Val Gln Lys Glu Pro Val Gln Met Glu Leu Ser Pro
725 730 735

CCC ATG GAG GTG GTC CAG AAG GAG CCT GTT CAG ATA GAG CTG TCT 2574
Pro Met Glu Val Val Gln Lys Glu Pro Val Gln Ile Glu Leu Ser
740 745 750

CCT CCC ATG GAG GTG GTC CAG AAG GAA CCT GTT AAG ATA GAG CTG 2619
Pro Pro Met Glu Val Val Gln Lys Glu Pro Val Lys Ile Glu Leu
755 760 765

TCT CCT CCC ATA GAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG 2664
Ser Pro Pro Ile Glu Val Val Gln Lys Glu Pro Val Gln Met Glu
770 775 780

TTG TCT CCT CCC ATG GGG GTG GTT CAG AAG GAG CCT GCT CAG AGG 2709
Leu Ser Pro Pro Met Gly Val Val Gln Lys Glu Pro Ala Gln Arg
785 790 795

GAG CCA CCT CCT CCC AGA GAG CCT CCC CTT CAC ATG GAG CCA ATT 2754
Glu Pro Pro Pro Pro Arg Glu Pro Pro Leu His Met Glu Pro Ile
800 805 810

TCC AAA AAG CCT CCT CTC CGA AAA GAT AAA AAG GAA AAG TCT AAC 2799
Ser Lys Lys Pro Pro Leu Arg Lys Asp Lys Lys Glu Lys Ser Asn
815 820 825

ATG CAG AGT GAA AGG GCA CGG AAG GAG CAA GTC CTT ATT GAA GTT 2844
Met Gln Ser Glu Arg Ala Arg Lys Glu Gln Val Leu Ile Glu Val
830 835 840



GGC TTA GTG CCT GTT AAA GAT AGC TGG CTT CTA AAG GAA AGT GTA 2889
Gly Leu Val Pro Val Lys Asp Ser Trp Leu Leu Lys Glu Ser Val
845 850 855

AGC ACA GAG GAT CTC TCA CCA CCA TCA CCA CCA CTG CCA AAG GAA 2934
Ser Thr Glu Asp Leu Ser Pro Pro Ser Pro Pro Leu Pro Lys Glu
860 865 870

AAT TTA AGA GAA GAG GCA TCA GGA GAC CAA AAA TTA CTC AAC ACA 2979
Asn Leu Arg Glu Glu Ala Ser Gly Asp Gln Lys Leu Leu Asn Thr
875 880 885

GGT GAA GGA AAT AAA GAA GCC CCT CTT CAG AAA GTA GGA GCA GAA 3024
Gly Glu Gly Asn Lys Glu Ala Pro Leu Gln Lys Val Gly Ala Glu
890 895 900

GAG GCA GAT GAG AGC CTA CCT GGT CTT GCT GCT AAT ATC AAC GAA 3069
Glu Ala Asp Glu Ser Leu Pro Gly Leu Ala Ala Asn Ile Asn Glu
905 910 915

TCT ACC CAT ATT TCA TCC TCT GGA CAA AAC TTG AAT ACG CCA GAG 3114
Ser Thr His Ile Ser Ser Ser Gly Gln Asn Leu Asn Thr Pro Glu
920 925 930

GGT GAA ACT TTA AAT GGT AAA CAT CAG ACT GAC AGT ATA GTT TGT 3159
Gly Glu Thr Leu Asn Gly Lys His Gln Thr Asp Ser Ile Val Cys
935 940 945



GAA ATG AAA ATG GAC ACT GAT CAG AAC ACA AGA GAG AAT CTC ACT 3204
Glu Met Lys Met Asp Thr Asp Gln Asn Thr Arg Glu Asn Leu Thr
950 955 960

GGT ATA AAT TCA ACA GTT GAA GAA CCA GTT TCA CCA ATG CTT CCC 3249
Gly Ile Asn Ser Thr Val Glu Glu Pro Val Ser Pro Met Leu Pro
965 970 975

CCT TCA GCA GTA GAA GAA CGT GAA GCA GTG TCC AAA ACT GCA CTG 3294
Pro Ser Ala Val Glu Glu Arg Glu Ala Val Ser Lys Thr Ala Leu
980 985 990

GCA TCA CCT CCT GCT ACA ATG GCA GCA AAT GAG TCT CAG GAA ATT 3339
Ala Ser Pro Pro Ala Thr Met Ala Ala Asn Glu Ser Gln Glu Ile
995 1000 1005

GAT GAA GAT GAA GGC ATC CAC AGC CAT GAA GGA AGT GAC CTA AGT 3384
Asp Glu Asp Glu Gly Ile His Ser His Glu Gly Ser Asp Leu Ser
1010 1015 1020

GAC AAC ATG TCA GAG GGT AGT GAT GAT TCT GGA TTG CAT GGG GCT 3429
Asp Asn Met Ser Glu Gly Ser Asp Asp Ser Gly Leu His Gly Ala
1025 1030 1035

CGG CCA GTT CCA CAA GAA TCT AGC AGA AAA AAT GCA AAG GAA GCC 3474
Arg Pro Val Pro Gln Glu Ser Ser Arg Lys Asn Ala Lys Glu Ala
1040 1045 1050

TTG GCA GTC AAA GCG GCT AAG GGA GAT TTT GTT TGT ATC TTC TGT 3519
Leu Ala Val Lys Ala Ala Lys Gly Asp Phe Val Cys Ile Phe Cys
1055 1060 1065

GAT CGT TCT TTC AGA AAG GGA AAA GAT TAC AGC AAA CAC CTC AAT 3564
Asp Arg Ser Phe Arg Lys Gly Lys Asp Tyr Ser Lys His Leu Asn
1070 1075 1080

CGC CAT TTG GTT AAT GTG TAC TAT CTT GAA GAA GCA GCT CAA GGG 3609
Arg His Leu Val Asn Val Tyr Tyr Leu Glu Glu Ala Ala Gln Gly
1085 1090 1095

CAG GAG TAATG AAACTTTGAA CAAGGTTTCA GTTCTTAGTT 3650
Gln Glu
TGTAAGGTAT ATTACATTTT ATATTCATTT ATGATAGCAG ACAACCTTTT 3700

AAGATTGCTT TAATTAGTAT CTGATGTTGA TTTTTAAGTG GCATTCTTTT 3750 

CCTTAGGACT TTTTATGTAT ACCTGTTGAT TGTTGTGTAA ATTTTAGTAA 3800

ATCTAAGAGA GTGTACTAAA CCAGCAGGTA TCTGTTAGCT TATGTGTTTA 3850 

ATTGAAATTA GAAGGCTAAG ATGGTATAAC AGCATTTTAT TGCTTTGTCC 3900 

AGCTACAACA TGTCATTTTT TTCTCCATGT CTTATCTTCC TGTTTCACTT 3950

[nucleotides 3951-4000 illegible in source] 4000

ATAGCAAATT ACTTGAAGAA TTTGCCTGCT TTATATAAAG TTAGCACTTT 4050

AAGATTTTTT TTTTAGAGAT GAGAAGACAT TTAAATTGAA GAAAAATTCC 4100 

CCCAGCAATA GACAGTCTAT CAGTCCAAGT ATTTACTTCC TGAGTTTTGA 4150 


TCAATATTTT TTATTTGTGT ATGTTAATCG TCATAAAAAC AGTGATTTTG 4200 

GTGTGTTTTT TATTTTGGTG CTTTAATGGC TTAAGATGTT GCACATTTTT 4250 

[illegible] GGTTTCTGTT TATGTTTTTT TGCCTATGCA GTTAAATTTT 4300

TCCTAGAAAT AGCATTTGTG TTGAACAGTA ACACTTTATA CATATATATA 4350 

TGCATGTTTA TTTTGTTTGG CGTCTTTGGA GGGATGCTTT TAGACTTGTT 4400 


TGCAAAAGGG CAGTTTTCTT TTTCTTTGCT GCAGTTGTCT ATTTTGCAGA 4450 

ATAATAGTGT GTGCAAGTTT GTGAGCAAAT GAAATATGCA GGTTCAATCT 4500 

ATTGATTTTG ATTTTTACAT CTTATATCTA TGCCAGAATC TGTATTTCAT 4550

ATAACTTATT TATTTCGAAT GGATGTAGTA AATTCACAGC TATCAGTTTT 4600 

GATTTTGCAA TAAATAAACC ACTAGGTTGC ATGTCGAACA AATTTTTATC 4650 

TCAAATACCA ACCATCAGTT TTTTTTTTCA TGTGTTTTGG TACAGCTAAT 4700

TCCTAATTGT AGAGTGTTAA ATGTTTGAGG AGAACCTTTT CTCATAGATG 4750

GTTGGTGTTC ATATGGCNAC TTTACAATAA AGAGAACTGT AAGTGATATT 4800

TGGAAACTAC AAACCTGGAA TTAGGAGATA TAATTATTCC TTCAAGTTTT 4850

ATAGATATCA CTTGGGAGAT TCCAAAGCCA TAGCTATTAC GCNGCAAACC 4900 

TAGGATAAGA AAGGTAGTAT GAGTGCTGGT AGACCAGCTG CAACATTTCC 4950 

TATATCAGAT GAAAAAGGCT GGTGAAACAA GTACAGTCCA GATTTTTTAA 5000 



AATCATACTT TCTCAGGGAT CTCCACAAAC TGGTGGGTGT CCTGGCTGTC 5050

TGTGTGATAG CCTCTTTCTA TAGGTGAGGC CTCAAATGAA TTGCAGCTAT 5100

CCTGGTGTTC CTATGAGGGC ACTTGTATGA AAAAGGCAGT ACTCCAAAAC 5150 

ATTTTTGATG GTTCTTTGGC CAGTTGCCAA AGAGTGTGAA AGAATCCAAT 5200 


AGAGGATTTT TCTTACTGAT AGCAGTCATT CATTGCAGTA AAATAAAATA 5250 

TGAATTCCCA TTAGGGAATC TTGAATTCTG ACCTCCCATA CTCCGTTTTG 5300 

AAATAACCAC TTATATTTCA TTTTTTAAAA ATCTGATGAT CTCTTTGAGG 5350

CAGGTTTCAG ATTTGGCAGT ACAACATGAA AGATTAGGAA AAGCATTAAT 5400 



AACGTGTGGG TGGAAAGCTT GTTAAAAATC TGAGAGTGAA GTTTGAGTTA 5450



AAAGTTGTTT GACATGGCAT TGACTGGGAG GCCAAAGATT TAAAGAAGCG 5500 

GAAGATTCTT CTCTTAAGAC ATGAGGAGTA AGTTGTGTGA TAATGGTATG 5550 

TGTTTTGTGT GCATGAATGG ACATTGTAAA TGTTGAATTC TAGGCTCCGA 5600

CAATCATTGT CAACAGAAGA TAAAGCTGCA AATATTTATG TTTTAAAA 5648 
(2) INFORMATION FOR SEQ ID NO:2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 756 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA to mRNA
  (iii) HYPOTHETICAL: no
  (iv) ANTI-SENSE: no
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Human
    (H) CELL LINE: HeLa
  (vii) IMMEDIATE SOURCE:
    (A) LIBRARY: cDNA
  (x) PUBLICATION INFORMATION:
    (A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez, José, Toledo-Aral,
        Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler, Yelena
        M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
    (B) TITLE: REST: A Mammalian Silencer Protein that Restricts
        Sodium Channel Gene Expression to Neurons
    (C) JOURNAL: Cell
    (D) VOLUME: 80
    (E) ISSUE:
    (F) PAGES:
    (G) DATE: March 24, 1995
    (K) RELEVANT RESIDUES IN SEQ ID NO:2: FROM 1 TO 756
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TGT AAG CCA TGC CAA 15
Cys Lys Pro Cys Gln
165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 60
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 105
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195

GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 150
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 195
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 240
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 285
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 330
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 375
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 420
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 465
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 510
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 555
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 600
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 645
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 690
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 735
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT 756
Tyr His Phe Lys Ser Lys His
410



(2) INFORMATION FOR SEQ ID NO:3:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1407 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: cDNA to mRNA
  (iii) HYPOTHETICAL: no
  (iv) ANTI-SENSE: no
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Human
    (H) CELL LINE: HeLa
  (vii) IMMEDIATE SOURCE:
    (A) LIBRARY: cDNA
  (x) PUBLICATION INFORMATION:
    (A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez, José, Toledo-Aral,
        Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler, Yelena
        M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
    (B) TITLE: REST: A Mammalian Silencer Protein that Restricts
        Sodium Channel Gene Expression to Neurons
    (C) JOURNAL: Cell
    (D) VOLUME: 80
    (E) ISSUE:
    (F) PAGES:
    (G) DATE: March 24, 1995
    (K) RELEVANT RESIDUES IN SEQ ID NO:3: FROM 1 TO 1407
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

G ATG GCA GAA 10 
5 Met Ala Glu 
75 



15 



35 



CTG 


ATG 


CCG 


GTT 


GGG 


GAT 


AAC 


AAC 


TTT 


TCA 


GAT 


AGT 


GAA 


GAA 


GGA 


55 


Leu 


Met 


Pro 


Val 


Gly 


Asp 


Asn 


Asn Phe Ser Asp Ser Glu Glu Gly
80 85 90

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 100
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly
95 100 105

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 145
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu
110 115 120

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 190
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser
125 130 135

TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 235
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys
140 145 150

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 280
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln
155 160 165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 325
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 370
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195

GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 415
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 460
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 505
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 550
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 595
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 640
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 685
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 730
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 775
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 820
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 865
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 910
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 955
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1000
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1045
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1090
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1135
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1180
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1225
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1270
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1315
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1360
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 1405
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

AC 1407 
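
The running nucleotide count printed at the right of each row offers a simple consistency check on a transcribed listing, because every full row of 15 codons advances the count by 45. The following is a minimal Python sketch of that check. The function name `check_row_ends` and the sample values (which contain one deliberately wrong entry) are illustrative assumptions and are not part of the listing.

```python
def check_row_ends(row_ends, step=45):
    """Return (index, printed, expected) for every printed row-end count
    that does not follow from the first value by a constant step."""
    problems = []
    expected = row_ends[0]
    for index, printed in enumerate(row_ends):
        if printed != expected:
            problems.append((index, printed, expected))
        expected += step  # each full row adds 15 codons = 45 nucleotides
    return problems


# Hypothetical transcription with one misread count (555 instead of 595).
sample = [415, 460, 505, 550, 555, 640, 685]
print(check_row_ends(sample))  # [(4, 555, 595)]
```

A final partial row (such as a two-nucleotide remainder) would have to be left out of the list, since it does not advance the count by a full step.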


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human
(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 4: FROM 1 TO 1090
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

C AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 40
Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 85
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 130
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 175
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 220
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 265
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 310
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 355
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 400
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 445
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360



WO 96/29433 PCT/US96/03940




CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 490
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 535
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 580
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 625
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 670
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 715
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 760
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 805
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 850
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 895
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 940
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 985
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

ACA AAA AAG AAA AAG AAG GTA GAA AGC AAA TCC AAA AAT AAT AGT 1030
Thr Lys Lys Lys Lys Lys Val Glu Ser Lys Ser Lys Asn Asn Ser
545 550 555

CAG GAA GTG CCA AAG GGT GAC AGC AAA GTG GAG GAG AAT AAA AAG 1075
Gln Glu Val Pro Lys Gly Asp Ser Lys Val Glu Glu Asn Lys Lys
560 565 570

CAA AAT ACT TGC ATG 1090
Gln Asn Thr Cys Met
575

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 928 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human
(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80
(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 5: FROM 1 TO 928
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CA GCA CAC CTG AAA CAC CAC ACC AGA 26
Ala His Leu Lys His His Thr Arg
235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 71
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 116
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 161
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 206
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 251
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 296
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 341
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 386
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 431
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 476
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 521
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 566
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 611
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 656
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 701
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 746
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 791
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 836
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 881
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 926
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

AC 928 
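
Because each row pairs a codon line with its three-letter translation, a row can be checked mechanically against the standard genetic code. The sketch below is one way to do this in Python. The helper names `translate_row` and `row_mismatches` are assumptions introduced for illustration, and the second example row contains a hypothetical transcription error (`TAG` in place of `TAC`).

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon
}


def translate_row(dna_row):
    """Translate space-separated codons; unreadable codons become '???'."""
    return [
        THREE_LETTER.get(CODON_TABLE.get(codon, ""), "???")
        for codon in dna_row.split()
    ]


def row_mismatches(dna_row, protein_row):
    """Return (1-based position, codon, translated, printed) per disagreement."""
    codons = dna_row.split()
    printed = protein_row.split()
    return [
        (pos, codon, translated, residue)
        for pos, (codon, translated, residue) in enumerate(
            zip(codons, translate_row(dna_row), printed), start=1
        )
        if translated != residue
    ]


# A row as printed in the listing: no disagreement expected.
print(row_mismatches(
    "GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT",
    "Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro",
))  # []

# Hypothetical misread codon: TAG (stop) where the protein line shows Tyr.
print(row_mismatches(
    "GCT GGG GAT AAT GAG CGA GTC TAG AAG",
    "Ala Gly Asp Asn Glu Arg Val Tyr Lys",
))  # [(8, 'TAG', 'Ter', 'Tyr')]
```

A reported mismatch only shows that the two lines disagree; which of the two was misread has to be settled from another copy of the sequence.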



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 1791 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human
(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:
(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 6: FROM 1 TO 1791
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CACCCTCTGC AGCCCCACTC CTGGGCCTTC TTGGTCCACG ACGGCCCCAG 50
CACCCAACTT TACCACCCTC CCCCACCTCT CCCCCGAAAC TCCAGCAACA 100
AAGAAAAGTA GTCGGAGAAG GAGCGGCGAC TCAGGGTCGC CCGCCCCTCC 150
TCACCGAGGA AGGCCGAATA CAGTT 175

ATG GCC ACC CAG GTA ATG GGG CAG TCT TCT GGA GGA GGA GGG CTG 220
Met Ala Thr Gln Val Met Gly Gln Ser Ser Gly Gly Gly Gly Leu
1 5 10 15

TTT ACC AGC AGT GGC AAC ATT GGA ATG GCC CTG CCT AAC GAC ATG 265
Phe Thr Ser Ser Gly Asn Ile Gly Met Ala Leu Pro Asn Asp Met
20 25 30

TAT GAC TTG CAT GAC CTT TCC AAA GCT GAA CTG GCC GCA CCT CAG 310
Tyr Asp Leu His Asp Leu Ser Lys Ala Glu Leu Ala Ala Pro Gln
35 40 45

CTT ATT ATG CTG GCA AAT GTG GCC TTA ACT GGG GAA GTA AAT GGC 355
Leu Ile Met Leu Ala Asn Val Ala Leu Thr Gly Glu Val Asn Gly
50 55 60

AGC TGC TGT GAT TAC CTG GTC GGT GAA GAA AGA CAG ATG GCA GAA 400
Ser Cys Cys Asp Tyr Leu Val Gly Glu Glu Arg Gln Met Ala Glu
65 70 75

CTG ATG CCG GTT GGG GAT AAC AAC TTT TCA GAT AGT GAA GAA GGA 445
Leu Met Pro Val Gly Asp Asn Asn Phe Ser Asp Ser Glu Glu Gly
80 85 90

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 490
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly
95 100 105

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 535
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu
110 115 120

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 580
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser
125 130 135

TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 625
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys
140 145 150

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 670
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln
155 160 165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 715
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 760
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195

GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 805
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 850
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 895
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 940
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 985
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 1030
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285



GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 1075
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 1120
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 1165
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 1210
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 1255
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1300
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1345
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1390
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1435
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1480
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1525
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1570
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1615
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1660
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1705
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1750
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TC 1791
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu 
530 535 
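
In SEQ ID NO: 6 the coding region begins after 175 nucleotides of 5' flanking sequence, so the residue numbers under each row and the nucleotide counts at the right are related by a fixed offset. The short Python sketch below states that relationship. The function name `residue_to_nt_span` and the parameter `utr_len` are illustrative assumptions; the offset of 175 is read from the listing above.

```python
def residue_to_nt_span(residue, utr_len=0):
    """Return the 1-based (start, end) nucleotide positions of the codon
    for a 1-based residue number, given the length of any leading
    untranslated sequence."""
    end = utr_len + 3 * residue
    return end - 2, end


# SEQ ID NO: 6 has 175 nucleotides ahead of the initiating ATG.
print(residue_to_nt_span(1, utr_len=175))   # (176, 178)
print(residue_to_nt_span(15, utr_len=175))  # (218, 220)
print(residue_to_nt_span(90, utr_len=175))  # (443, 445)
```

The end positions for residues 15 and 90 agree with the counts 220 and 445 printed beside the corresponding rows.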

(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 3705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:
(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 7: FROM 1 TO 3705
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GA ACT CGA AAA TCA 14
Thr Arg Lys Ser
495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 59
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 104
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 149
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

ACA AAA AAG AAA AAG AAG GTA GAA AGC AAA TCC AAA AAT AAT AGT 194
Thr Lys Lys Lys Lys Lys Val Glu Ser Lys Ser Lys Asn Asn Ser
545 550 555

CAG GAA GTG CCA AAG GGT GAC AGC AAA GTG GAG GAG AAT AAA AAG 239
Gln Glu Val Pro Lys Gly Asp Ser Lys Val Glu Glu Asn Lys Lys
560 565 570

CAA AAT ACT TGC ATG AAA AAA AGT ACA AAG AAG AAA ACT CTG AAA 284
Gln Asn Thr Cys Met Lys Lys Ser Thr Lys Lys Lys Thr Leu Lys
575 580 585

AAT AAA TCA AGT AAG AAA AGC ACT AAG CCT CCT CAG AAG GAA CCT 329
Asn Lys Ser Ser Lys Lys Ser Ser Lys Pro Pro Gln Lys Glu Pro
590 595 600

GTT GAG AAG GGA TCT GCT CAG ATG GAC CCT CCT CAG ATG GGG CCT 374
Val Glu Lys Gly Ser Ala Gln Met Asp Pro Pro Gln Met Gly Pro
605 610 615

GCT CCC ACA GAG GCG GTT CAG AAG GGG CCC GTT CAG GTG GAG CTG 419
Ala Pro Thr Glu Ala Val Gln Lys Gly Pro Val Gln Val Glu Leu
620 625 630

CCA CCT CCC ATG GAG CAT GCT CAG ATG GAG GGT GCC CAG ATA CGG 464
Pro Pro Pro Met Glu His Ala Gln Met Glu Gly Ala Gln Ile Arg
635 640 645

CCT GCT CCT GAC GAG CCT GTT CAG ATG GAG GTG GTT CAG GAG GGG 509
Pro Ala Pro Asp Glu Pro Val Gln Met Glu Val Val Gln Glu Gly
650 655 660

CCT GCT CAG AAG GAG CTG CTG CCT CCC GTG GAG CCT GCT CAG ATG 554
Pro Ala Gln Lys Glu Leu Leu Pro Pro Val Glu Pro Ala Gln Met
665 670 675

GTG GGT GCC CAA ATT GTA CTT GCT CAC ATG GAG CTG CCT CCT CCC 599
Val Gly Ala Gln Ile Val Leu Ala His Met Glu Leu Pro Pro Pro
680 685 690

ATG GAG ACT GCT CAG ACG GAG GTT GCC CAA ATG GGG CCT GCT CCC 644
Met Glu Thr Ala Gln Thr Glu Val Ala Gln Met Gly Pro Ala Pro
695 700 705

ATG GAA CCT GCT CAG ATG GAG GTT GCC CAG GTA GAA TCT GCT CCC 689
Met Glu Pro Ala Gln Met Glu Val Ala Gln Val Glu Ser Ala Pro
710 715 720

ATG CAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG CTG TCT CCT 734
Met Gln Val Val Gln Lys Glu Pro Val Gln Met Glu Leu Ser Pro
725 730 735

CCC ATG GAG GTG GTC CAG AAG GAG CCT GTT CAG ATA GAG CTG TCT 779
Pro Met Glu Val Val Gln Lys Glu Pro Val Gln Ile Glu Leu Ser
740 745 750

CCT CCC ATG GAG GTG GTC CAG AAG GAA CCT GTT AAG ATA GAG CTG 824
Pro Pro Met Glu Val Val Gln Lys Glu Pro Val Lys Ile Glu Leu
755 760 765

TCT CCT CCC ATA GAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG 869
Ser Pro Pro Ile Glu Val Val Gln Lys Glu Pro Val Gln Met Glu
770 775 780

TTG TCT CCT CCC ATG GGG GTG GTT CAG AAG GAG CCT GCT CAG AGG 914
Leu Ser Pro Pro Met Gly Val Val Gln Lys Glu Pro Ala Gln Arg
785 790 795

GAG CCA CCT CCT CCC AGA GAG CCT CCC CTT CAC ATG GAG CCA ATT 959
Glu Pro Pro Pro Pro Arg Glu Pro Pro Leu His Met Glu Pro Ile
800 805 810

TCC AAA AAG CCT CCT CTC CGA AAA GAT AAA AAG GAA AAG TCT AAC 1004
Ser Lys Lys Pro Pro Leu Arg Lys Asp Lys Lys Glu Lys Ser Asn
815 820 825

ATG CAG AGT GAA AGG GCA CGG AAG GAG CAA GTC CTT ATT GAA GTT 1049
Met Gln Ser Glu Arg Ala Arg Lys Glu Gln Val Leu Ile Glu Val
830 835 840

GGC TTA GTG CCT GTT AAA GAT AGC TGG CTT CTA AAG GAA AGT GTA 1094
Gly Leu Val Pro Val Lys Asp Ser Trp Leu Leu Lys Glu Ser Val
845 850 855

AGC ACA GAG GAT CTC TCA CCA CCA TCA CCA CCA CTG CCA AAG GAA 1139
Ser Thr Glu Asp Leu Ser Pro Pro Ser Pro Pro Leu Pro Lys Glu
860 865 870

AAT TTA AGA GAA GAG GCA TCA GGA GAC CAA AAA TTA CTC AAC ACA 1184
Asn Leu Arg Glu Glu Ala Ser Gly Asp Gln Lys Leu Leu Asn Thr
875 880 885



GGT GAA GGA AAT AAA GAA GCC CCT CTT CAG AAA GTA GGA GCA GAA 1229
Gly Glu Gly Asn Lys Glu Ala Pro Leu Gln Lys Val Gly Ala Glu
890 895 900

GAG GCA GAT GAG AGC CTA CCT GGT CTT GCT GCT AAT ATC AAC GAA 1274
Glu Ala Asp Glu Ser Leu Pro Gly Leu Ala Ala Asn Ile Asn Glu
905 910 915

TCT ACC CAT ATT TCA TCC TCT GGA CAA AAC TTG AAT ACG CCA GAG 1319
Ser Thr His Ile Ser Ser Ser Gly Gln Asn Leu Asn Thr Pro Glu
920 925 930

GGT GAA ACT TTA AAT GGT AAA CAT CAG ACT GAC AGT ATA GTT TGT 1364
Gly Glu Thr Leu Asn Gly Lys His Gln Thr Asp Ser Ile Val Cys
935 940 945

GAA ATG AAA ATG GAC ACT GAT CAG AAC ACA AGA GAG AAT CTC ACT 1409
Glu Met Lys Met Asp Thr Asp Gln Asn Thr Arg Glu Asn Leu Thr
950 955 960

GGT ATA AAT TCA ACA GTT GAA GAA CCA GTT TCA CCA ATG CTT CCC 1454
Gly Ile Asn Ser Thr Val Glu Glu Pro Val Ser Pro Met Leu Pro
965 970 975

CCT TCA GCA GTA GAA GAA CGT GAA GCA GTG TCC AAA ACT GCA CTG 1499
Pro Ser Ala Val Glu Glu Arg Glu Ala Val Ser Lys Thr Ala Leu
980 985 990

GCA TCA CCT CCT GCT ACA ATG GCA GCA AAT GAG TCT CAG GAA ATT 1544
Ala Ser Pro Pro Ala Thr Met Ala Ala Asn Glu Ser Gln Glu Ile
995 1000 1005

GAT GAA GAT GAA GGC ATC CAC AGC CAT GAA GGA AGT GAC CTA AGT 1589
Asp Glu Asp Glu Gly Ile His Ser His Glu Gly Ser Asp Leu Ser
1010 1015 1020

GAC AAC ATG TCA GAG GGT AGT GAT GAT TCT GGA TTG CAT GGG GCT 1634
Asp Asn Met Ser Glu Gly Ser Asp Asp Ser Gly Leu His Gly Ala
1025 1030 1035

CGG CCA GTT CCA CAA GAA TCT AGC AGA AAA AAT GCA AAG GAA GCC 1679
Arg Pro Val Pro Gln Glu Ser Ser Arg Lys Asn Ala Lys Glu Ala
1040 1045 1050

TTG GCA GTC AAA GCG GCT AAG GGA GAT TTT GTT TGT ATC TTC TGT 1724
Leu Ala Val Lys Ala Ala Lys Gly Asp Phe Val Cys Ile Phe Cys
1055 1060 1065

GAT CGT TCT TTC AGA AAG GGA AAA GAT TAC AGC AAA CAC CTC AAT 1769
Asp Arg Ser Phe Arg Lys Gly Lys Asp Tyr Ser Lys His Leu Asn
1070 1075 1080

CGC CAT TTG GTT AAT GTG TAC TAT CTT GAA GAA GCA GCT CAA GGG 1814
Arg His Leu Val Asn Val Tyr Tyr Leu Glu Glu Ala Ala Gln Gly
1085 1090 1095

CAG GAG TAATG AAACTTTGAA CAAGGTTTCA GTTCTTAGTT 1855
Gln Glu

TGTAAGGTAT ATTACATTTT ATATTCATTT ATGATAGCAG ACAACCTTTT 1905 

AAGATTGCTT TAATTAGTAT CTGATGTTGA TTTTTAAGTG GCATTCTTTT 1955 

CCTTAGGACT TTTTATGTAT ACCTGTTGAT TGTTGTGTAA ATTTTAGTAA 2005

ATCTAAGAGA GTGTACTAAA CCAGCAGGTA TCTGTTAGCT TATGTGTTTA 2055 

ATTGAAATTA GAAGGCTAAG ATGGTATAAC AGCATTTTAT TGCTTTGTCC 2105 


AGCTACAACA TGTCATTTTT TTCTCCATGT CTTATCTTCC TGTTTCACTT 2155 

TAGTTTATTC TTCGTTTTTT ATTGAGATCT ATAAAAAATT GGCTTACTTA 2205 

ATAGCAAATT ACTTGAAGAA TTTGCCTGCT TTATATAAAG TTAGCACTTT 2255

AAGATTTTTT TTTTAGAGAT GAGAAGACAT TTAAATTGAA GAAAAATTCC 2305 

CCCAGCAATA GACAGTCTAT CAGTCCAAGT ATTTACTTCC TGAGTTTTGA 2355 


TCAATATTTT TTATTTGTGT ATGTTAATCG TCATAAAAAC AGTGATTTTG 2405
GTGTGTTTTT TATTTTGGTG CTTTAATGGC TTAAGATGTT GCACATTTTT 2455

TTTTTCTTTT GGTTTCTGTT TATGTTTTTT TGCCTATGCA GTTAAATTTT 2505

TCCTAGAAAT AGCATTTGTG TTGAACAGTA ACACTTTATA CATATATATA 2555

TGCATGTTTA TTTTGTTTGG CGTCTTTGGA GGGATGCTTT TAGACTTGTT 2605 

TGCAAAAGGG CAGTTTTCTT TTTCTTTGCT GCAGTTGTCT ATTTTGCAGA 2655 


ATAATAGTGT GTGCAAGTTT GTGAGCAAAT GAAATATGCA GGTTCAATCT 2705 

ATTGATTTTG ATTTTTACAT CTTATATCTA TGCCAGAATC TGTATTTCAT 2755 

ATAACTTATT TATTTCGAAT GGATGTAGTA AATTCACAGC TATCAGTTTT 2805

GATTTTGCAA TAAATAAACC ACTAGGTTGC ATGTCGAACA AATTTTTATC 2855 

TCAAATACCA ACCATCAGTT TTTTTTTTCA TGTGTTTTGG TACAGCTAAT 2905 


TCCTAATTGT AGAGTGTTAA ATGTTTGAGG AGAACCTTTT CTCATAGATG 2955 

GTTGGTGTTC ATATGGCNAC TTTACAATAA AGAGAACTGT AAGTGATATT 3005 

TGGAAACTAC AAACCTGGAA TTAGGAGATA TAATTATTCC TTCAAGTTTT 3055

ATAGATATCA CTTGGGAGAT TCCAAAGCCA TAGCTATTAC GCNGCAAACC 3105 

TAGGATAAGA AAGGTAGTAT GAGTGCTGGT AGACCAGCTG CAACATTTCC 3155 


TATATCAGAT GAAAAAGGCT GGTGAAACAA GTACAGTCCA GATTTTTTAA 3205 

AATCATACTT TCTCAGGGAT CTCCACAAAC TGGTGGGTGT CCTGGCTGTC 3255 

TGTGTGATAG CCTCTTTCTA TAGGTGAGGC CTCAAATGAA TTGCAGCTAT 3305

CCTGGTGTTC CTATGAGGGC ACTTGTATGA AAAAGGCAGT ACTCCAAAAC 3355 

ATTTTTGATG GTTCTTTGGC CAGTTGCCAA AGAGTGTGAA AGAATCCAAT 3405 


AGAGGATTTT TCTTACTGAT AGCAGTCATT CATTGCAGTA AAATAAAATA 3455
TGAATTCCCA TTAGGGAATC TTGAATTCTG ACCTCCCATA CTCCGTTTTG 3505
AAATAACCAC TTATATTTCA TTTTTTAAAA ATCTGATGAT CTCTTTGAGG 3555
CAGGTTTCAG ATTTGGCAGT ACAACATGAA AGATTAGGAA AAGCATTAAT 3605
AACGTGTGGG TGGAAAGCTT GTTAAAAATC TGAGAGTGAA GTTTGAGTTA 3655
AAAGTTGTTT GACATGGCAT TGACTGGGAG GCCAAAGATT TAAAGAAGCG 3705

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 8: FROM 1 TO 24
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TGYAARCCNT GYCARTAYGA RGCN 24



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 9: FROM 1 TO 24
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

NGTYTTRTAR TCRCARTGNG GRCA 24 
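
SEQ ID NO: 8 and SEQ ID NO: 9 are written with IUPAC degenerate base codes (R = A or G, Y = C or T, N = any base). The Python sketch below expands such codes codon by codon under the standard genetic code and forms a degenerate-aware reverse complement. The function names are illustrative assumptions and are not taken from the specification. Read this way, SEQ ID NO: 8 decodes to Cys Lys Pro Cys Gln Tyr Glu Ala, and the reverse complement of SEQ ID NO: 9 decodes to Cys Pro His Cys Asp Tyr Lys Thr; both peptides appear in the translated sequences listed above, which suggests a sense and an antisense primer pair.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement that also maps degenerate codes (R<->Y, K<->M, ...)."""
    return seq.translate(COMPLEMENT)[::-1]


def decode_degenerate(seq):
    """For each complete codon, the sorted one-letter residues it can encode."""
    residues = []
    for start in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[start:start + 3]
        options = {
            CODON_TABLE["".join(bases)]
            for bases in product(*(IUPAC[code] for code in codon))
        }
        residues.append("".join(sorted(options)))
    return residues


def degeneracy(seq):
    """Number of distinct plain sequences covered by a degenerate sequence."""
    count = 1
    for code in seq:
        count *= len(IUPAC[code])
    return count


SEQ8 = "TGYAARCCNTGYCARTAYGARGCN"
SEQ9 = "NGTYTTRTARTCRCARTGNGGRCA"

print(decode_degenerate(SEQ8))  # ['C', 'K', 'P', 'C', 'Q', 'Y', 'E', 'A']
print(reverse_complement(SEQ9))  # TGYCCNCAYTGYGAYTAYAARACN
print(decode_degenerate(reverse_complement(SEQ9)))  # ['C', 'P', 'H', 'C', 'D', 'Y', 'K', 'T']
print(degeneracy(SEQ8))  # 1024
```

Each position with more than one letter in the decoded list would indicate a codon whose degenerate form is ambiguous at the protein level; neither primer shows such a position.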

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 3291 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-Aral,
Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:
(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 10: FROM 1 TO 3291
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATG GCC ACC CAG GTA ATG GGG CAG TCT TCT GGA GGA GGA GGG CTG 45
Met Ala Thr Gln Val Met Gly Gln Ser Ser Gly Gly Gly Gly Leu
1 5 10 15

TTT ACC AGC AGT GGC AAC ATT GGA ATG GCC CTG CCT AAC GAC ATG 90
Phe Thr Ser Ser Gly Asn Ile Gly Met Ala Leu Pro Asn Asp Met
20 25 30

TAT GAC TTG CAT GAC CTT TCC AAA GCT GAA CTG GCC GCA CCT CAG 135
Tyr Asp Leu His Asp Leu Ser Lys Ala Glu Leu Ala Ala Pro Gln
35 40 45

CTT ATT ATG CTG GCA AAT GTG GCC TTA ACT GGG GAA GTA AAT GGC 180
Leu Ile Met Leu Ala Asn Val Ala Leu Thr Gly Glu Val Asn Gly
50 55 60

AGC TGC TGT GAT TAC CTG GTC GGT GAA GAA AGA CAG ATG GCA GAA 225
Ser Cys Cys Asp Tyr Leu Val Gly Glu Glu Arg Gln Met Ala Glu
65 70 75

CTG ATG CCG GTT GGG GAT AAC AAC TTT TCA GAT AGT GAA GAA GGA 270
Leu Met Pro Val Gly Asp Asn Asn Phe Ser Asp Ser Glu Glu Gly
80 85 90

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 315
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly
95 100 105

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 360
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu
110 115 120

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 405
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser
125 130 135



TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 450
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys
140 145 150

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 495
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln
155 160 165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 540
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 585
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195



GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 630
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 675
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 720
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 765
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 810
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 855
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285



GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 900
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 945
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 990
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 1035
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 1080
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1125
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1170
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1215
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1260
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1305
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1350
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1395
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1440
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser
470 475 480

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1485
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser
485 490 495

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1530
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser
500 505 510

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1575
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val
515 520 525

GAC AGC CAT TCT TTA CAT GGT CCT GTG AAT GAT GAG GAA TCT TCA 1620
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser
530 535 540

ACA AAA AAG AAA AAG AAG GTA GAA AGC AAA TCC AAA AAT AAT AGT 1665
Thr Lys Lys Lys Lys Lys Val Glu Ser Lys Ser Lys Asn Asn Ser
545 550 555

CAG GAA GTG CCA AAG GGT GAC AGC AAA GTG GAG GAG AAT AAA AAG 1710
Gln Glu Val Pro Lys Gly Asp Ser Lys Val Glu Glu Asn Lys Lys
560 565 570

CAA AAT ACT TGC ATG AAA AAA AGT ACA AAG AAG AAA ACT CTG AAA 1755
Gln Asn Thr Cys Met Lys Lys Ser Thr Lys Lys Lys Thr Leu Lys
575 580 585

AAT AAA TCA AGT AAG AAA AGC ACT AAG CCT CCT CAG AAG GAA CCT 1800
Asn Lys Ser Ser Lys Lys Ser Ser Lys Pro Pro Gln Lys Glu Pro
590 595 600

GTT GAG AAG GGA TCT GCT CAG ATG GAC CCT CCT CAG ATG GGG CCT 1845
Val Glu Lys Gly Ser Ala Gln Met Asp Pro Pro Gln Met Gly Pro
605 610 615

GCT CCC ACA GAG GCG GTT CAG AAG GGG CCC GTT CAG GTG GAG CTG 1890
Ala Pro Thr Glu Ala Val Gln Lys Gly Pro Val Gln Val Glu Leu
620 625 630

CCA CCT CCC ATG GAG CAT GCT CAG ATG GAG GGT GCC CAG ATA CGG 1935
Pro Pro Pro Met Glu His Ala Gln Met Glu Gly Ala Gln Ile Arg
635 640 645

CCT GCT CCT GAC GAG CCT GTT CAG ATG GAG GTG GTT CAG GAG GGG 1980
Pro Ala Pro Asp Glu Pro Val Gln Met Glu Val Val Gln Glu Gly
650 655 660

CCT GCT CAG AAG GAG CTG CTG CCT CCC GTG GAG CCT GCT CAG ATG 2025
Pro Ala Gln Lys Glu Leu Leu Pro Pro Val Glu Pro Ala Gln Met
665 670 675

GTG GGT GCC CAA ATT GTA CTT GCT CAC ATG GAG CTG CCT CCT CCC 2070
Val Gly Ala Gln Ile Val Leu Ala His Met Glu Leu Pro Pro Pro
680 685 690

ATG GAG ACT GCT CAG ACG GAG GTT GCC CAA ATG GGG CCT GCT CCC 2115
Met Glu Thr Ala Gln Thr Glu Val Ala Gln Met Gly Pro Ala Pro
695 700 705

ATG GAA CCT GCT CAG ATG GAG GTT GCC CAG GTA GAA TCT GCT CCC 2160
Met Glu Pro Ala Gln Met Glu Val Ala Gln Val Glu Ser Ala Pro
710 715 720

ATG CAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG CTG TCT CCT 2205
Met Gln Val Val Gln Lys Glu Pro Val Gln Met Glu Leu Ser Pro
725 730 735

CCC ATG GAG GTG GTC CAG AAG GAG CCT GTT CAG ATA GAG CTG TCT 2250
Pro Met Glu Val Val Gln Lys Glu Pro Val Gln Ile Glu Leu Ser
740 745 750

CCT CCC ATG GAG GTG GTC CAG AAG GAA CCT GTT AAG ATA GAG CTG 2295
Pro Pro Met Glu Val Val Gln Lys Glu Pro Val Lys Ile Glu Leu
755 760 765

TCT CCT CCC ATA GAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG 2340
Ser Pro Pro Ile Glu Val Val Gln Lys Glu Pro Val Gln Met Glu
770 775 780

TTG TCT CCT CCC ATG GGG GTG GTT CAG AAG GAG CCT GCT CAG AGG 2385
Leu Ser Pro Pro Met Gly Val Val Gln Lys Glu Pro Ala Gln Arg
785 790 795

GAG CCA CCT CCT CCC AGA GAG CCT CCC CTT CAC ATG GAG CCA ATT 2430
Glu Pro Pro Pro Pro Arg Glu Pro Pro Leu His Met Glu Pro Ile
800 805 810

TCC AAA AAG CCT CCT CTC CGA AAA GAT AAA AAG GAA AAG TCT AAC 2475
Ser Lys Lys Pro Pro Leu Arg Lys Asp Lys Lys Glu Lys Ser Asn
815 820 825

ATG CAG AGT GAA AGG GCA CGG AAG GAG CAA GTC CTT ATT GAA GTT 2520
Met Gln Ser Glu Arg Ala Arg Lys Glu Gln Val Leu Ile Glu Val
830 835 840

GGC TTA GTG CCT GTT AAA GAT AGC TGG CTT CTA AAG GAA AGT GTA 2565
Gly Leu Val Pro Val Lys Asp Ser Trp Leu Leu Lys Glu Ser Val
845 850 855

AGC ACA GAG GAT CTC TCA CCA CCA TCA CCA CCA CTG CCA AAG GAA 2610
Ser Thr Glu Asp Leu Ser Pro Pro Ser Pro Pro Leu Pro Lys Glu
860 865 870

AAT TTA AGA GAA GAG GCA TCA GGA GAC CAA AAA TTA CTC AAC ACA 2655
Asn Leu Arg Glu Glu Ala Ser Gly Asp Gln Lys Leu Leu Asn Thr
875 880 885



GGT GAA GGA AAT AAA GAA GCC CCT CTT CAG AAA GTA GGA GCA GAA 2700
Gly Glu Gly Asn Lys Glu Ala Pro Leu Gln Lys Val Gly Ala Glu
890 895 900

GAG GCA GAT GAG AGC CTA CCT GGT CTT GCT GCT AAT ATC AAC GAA 2745
Glu Ala Asp Glu Ser Leu Pro Gly Leu Ala Ala Asn Ile Asn Glu
905 910 915

TCT ACC CAT ATT TCA TCC TCT GGA CAA AAC TTG AAT ACG CCA GAG 2790
Ser Thr His Ile Ser Ser Ser Gly Gln Asn Leu Asn Thr Pro Glu
920 925 930

GGT GAA ACT TTA AAT GGT AAA CAT CAG ACT GAC AGT ATA GTT TGT 2835
Gly Glu Thr Leu Asn Gly Lys His Gln Thr Asp Ser Ile Val Cys
935 940 945

GAA ATG AAA ATG GAC ACT GAT CAG AAC ACA AGA GAG AAT CTC ACT 2880
Glu Met Lys Met Asp Thr Asp Gln Asn Thr Arg Glu Asn Leu Thr
950 955 960

GGT ATA AAT TCA ACA GTT GAA GAA CCA GTT TCA CCA ATG CTT CCC 2925
Gly Ile Asn Ser Thr Val Glu Glu Pro Val Ser Pro Met Leu Pro
965 970 975

CCT TCA GCA GTA GAA GAA CGT GAA GCA GTG TCC AAA ACT GCA CTG 2970
Pro Ser Ala Val Glu Glu Arg Glu Ala Val Ser Lys Thr Ala Leu
980 985 990

GCA TCA CCT CCT GCT ACA ATG GCA GCA AAT GAG TCT CAG GAA ATT 3015
Ala Ser Pro Pro Ala Thr Met Ala Ala Asn Glu Ser Gln Glu Ile
995 1000 1005

GAT GAA GAT GAA GGC ATC CAC AGC CAT GAA GGA AGT GAC CTA AGT 3060
Asp Glu Asp Glu Gly Ile His Ser His Glu Gly Ser Asp Leu Ser
1010 1015 1020

GAC AAC ATG TCA GAG GGT AGT GAT GAT TCT GGA TTG CAT GGG GCT 3105
Asp Asn Met Ser Glu Gly Ser Asp Asp Ser Gly Leu His Gly Ala
1025 1030 1035



CGG CCA GTT CCA CAA GAA TCT AGC AGA AAA AAT GCA AAG GAA GCC 3150
Arg Pro Val Pro Gln Glu Ser Ser Arg Lys Asn Ala Lys Glu Ala
1040 1045 1050

TTG GCA GTC AAA GCG GCT AAG GGA GAT TTT GTT TGT ATC TTC TGT 3195
Leu Ala Val Lys Ala Ala Lys Gly Asp Phe Val Cys Ile Phe Cys
1055 1060 1065

GAT CGT TCT TTC AGA AAG GGA AAA GAT TAC AGC AAA CAC CTC AAT 3240
Asp Arg Ser Phe Arg Lys Gly Lys Asp Tyr Ser Lys His Leu Asn
1070 1075 1080

CGC CAT TTG GTT AAT GTG TAC TAT CTT GAA GAA GCA GCT CAA GGG 3285
Arg His Leu Val Asn Val Tyr Tyr Leu Glu Glu Ala Ala Gln Gly
1085 1090 1095

CAG GAG 3291
Gln Glu
1097



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human
(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons
(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 11: FROM 1 TO 63

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TGT AAG CCA TGC CAA TAT
Cys Lys Pro Cys Gln Tyr
165

GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT CAC 63
Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val His
170 175 180

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human

(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 12: FROM 1 TO 63 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGT GAC CGC TGC GGC TAC AAT ACT
Cys Asp Arg Cys Gly Tyr Asn Thr
220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC 63
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His
230 235

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 13: FROM 1 TO 63
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TGT ATC ATT TGC ACA TAC 18
Cys Ile Ile Cys Thr Tyr
250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 63
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 14: FROM 1 TO 63

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



TGT GGA AAA TGC AAC TAT TTT TCA 24
Cys Gly Lys Cys Asn Tyr Phe Ser
280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT 63
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His
290 295

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 15: FROM 1 TO 63
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 30
Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT 63
Lys Thr His Leu Thr Arg His Met Arg Thr His
320 325



(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS 



(A) LENGTH: 66 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 



(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human

(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80
(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 16: FROM 1 TO 66

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 36
Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC 66
Glu Val Thr Arg His Ala Arg Gln Val His
350 355



(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 63 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:
(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 17: FROM 1 TO 63

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 39
Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT 63
Phe Lys Lys His Val Glu Leu His
380



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 66 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human

(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons
(C) JOURNAL: Cell

(D) VOLUME: 80
(E) ISSUE:

(F) PAGES:
(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 18: FROM 1 TO 66
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 45
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT 66
Tyr His Phe Lys Ser Lys His
410



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 441 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human
(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:
(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 20: FROM 1 TO 441
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATG GAG GTG GTT CAG GAG GGG 21
Met Glu Val Val Gln Glu Gly
655 660

CCT GCT CAG AAG GAG CTG CTG CCT CCC GTG GAG CCT GCT CAG ATG 66
Pro Ala Gln Lys Glu Leu Leu Pro Pro Val Glu Pro Ala Gln Met
665 670 675

GTG GGT GCC CAA ATT GTA CTT GCT CAC ATG GAG CTG CCT CCT CCC 111
Val Gly Ala Gln Ile Val Leu Ala His Met Glu Leu Pro Pro Pro
680 685 690

ATG GAG ACT GCT CAG ACG GAG GTT GCC CAA ATG GGG CCT GCT CCC 156
Met Glu Thr Ala Gln Thr Glu Val Ala Gln Met Gly Pro Ala Pro
695 700 705

ATG GAA CCT GCT CAG ATG GAG GTT GCC CAG GTA GAA TCT GCT CCC 201
Met Glu Pro Ala Gln Met Glu Val Ala Gln Val Glu Ser Ala Pro
710 715 720

ATG CAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG CTG TCT CCT 246
Met Gln Val Val Gln Lys Glu Pro Val Gln Met Glu Leu Ser Pro
725 730 735

CCC ATG GAG GTG GTC CAG AAG GAG CCT GTT CAG ATA GAG CTG TCT 291
Pro Met Glu Val Val Gln Lys Glu Pro Val Gln Ile Glu Leu Ser
740 745 750

CCT CCC ATG GAG GTG GTC CAG AAG GAA CCT GTT AAG ATA GAG CTG 336
Pro Pro Met Glu Val Val Gln Lys Glu Pro Val Lys Ile Glu Leu
755 760 765

TCT CCT CCC ATA GAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG 381
Ser Pro Pro Ile Glu Val Val Gln Lys Glu Pro Val Gln Met Glu
770 775 780

TTG TCT CCT CCC ATG GGG GTG GTT CAG AAG GAG CCT GCT CAG AGG 426
Leu Ser Pro Pro Met Gly Val Val Gln Lys Glu Pro Ala Gln Arg
785 790 795

GAG CCA CCT CCT CCC 441
Glu Pro Pro Pro Pro
800



(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 21: FROM 1 TO 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:



ATG GAG GTG GTT CAG GAG GGG 21
Met Glu Val Val Gln Glu Gly
655 660

CCT GCT CAG AAG GAG CTG CTG CCT CCC 48
Pro Ala Gln Lys Glu Leu Leu Pro Pro
665

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 22: FROM 1 TO 48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

ATG CAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG CTG TCT CCT 45
Met Gln Val Val Gln Lys Glu Pro Val Gln Met Glu Leu Ser Pro
725 730 735

CCC 48
Pro



(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no
(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human
(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80
(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 23: FROM 1 TO 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG GAG GTG GTC CAG AAG GAG CCT GTT CAG ATA GAG CTG TCT 42
Met Glu Val Val Gln Lys Glu Pro Val Gln Ile Glu Leu Ser
740 745 750

CCT CCC 48
Pro Pro

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human
(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 24: FROM 1 TO 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:



ATG GAG GTG GTC CAG AAG GAA CCT GTT AAG ATA GAG CTG
Met Glu Val Val Gln Lys Glu Pro Val Lys Ile Glu Leu
755 760 765

TCT CCT CCC
Ser Pro Pro

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human
(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80

(E) ISSUE:
(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 25: FROM 1 TO 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATA GAG GTG GTC CAG AAG GAG CCT GTT CAG ATG GAG
Ile Glu Val Val Gln Lys Glu Pro Val Gln Met Glu
770 775 780

TTG TCT CCT CCC
Leu Ser Pro Pro

(2) INFORMATION FOR SEQ ID NO: 26: 
(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Human

(H) CELL LINE: HeLa

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: cDNA

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell
(D) VOLUME: 80

(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 26: FROM 1 TO 48
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:



ATG GGG GTG GTT CAG AAG GAG CCT GCT CAG AGG
Met Gly Val Val Gln Lys Glu Pro Ala Gln Arg
785 790 795

GAG CCA CCT CCT CCC 48
Glu Pro Pro Pro Pro
800

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS

(A) LENGTH: 1461 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human

(H) CELL LINE: HeLa
(vii) IMMEDIATE SOURCE:

(A) LIBRARY: cDNA
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez José, Toledo-
Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler,
Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail

(B) TITLE: REST: A Mammalian Silencer Protein that Restricts
Sodium Channel Gene Expression to Neurons

(C) JOURNAL: Cell

(D) VOLUME: 80
(E) ISSUE:

(F) PAGES:

(G) DATE: March 24, 1995

(K) RELEVANT RESIDUES IN SEQ ID NO: 27: FROM 1 TO 1461
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CTG GCC GCA CCT CAG 15
Leu Ala Ala Pro Gln
45

CTT ATT ATG CTG GCA AAT GTG GCC TTA ACT GGG GAA GTA AAT GGC 60
Leu Ile Met Leu Ala Asn Val Ala Leu Thr Gly Glu Val Asn Gly
50 55 60

AGC TGC TGT GAT TAC CTG GTC GGT GAA GAA AGA CAG ATG GCA GAA 105
Ser Cys Cys Asp Tyr Leu Val Gly Glu Glu Arg Gln Met Ala Glu
65 70 75

CTG ATG CCG GTT GGG GAT AAC AAC TTT TCA GAT AGT GAA GAA GGA 150
Leu Met Pro Val Gly Asp Asn Asn Phe Ser Asp Ser Glu Glu Gly
80 85 90

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 195
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly
95 100 105

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 240
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu
110 115 120

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 285
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser
125 130 135

TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 330
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys
140 145 150

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 375
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln
155 160 165

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 420
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val
170 175 180

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 465
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln
185 190 195

GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 510
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp
200 205 210

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 555
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr
215 220 225

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 600
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg
230 235 240

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 645
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr
245 250 255

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 690
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His
260 265 270

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 735
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser
275 280 285

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 780
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly
290 295 300

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 825
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln
305 310 315

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 870
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys
320 325 330

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 915
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His
335 340 345

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 960
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro
350 355 360

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1005
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn
365 370 375

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1050
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn
380 385 390

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1095
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln
395 400 405

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1140
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met
410 415 420

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1185
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala
425 430 435

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1230
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln
440 445 450

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1275
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser
455 460 465



GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1320 
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser 
470 475 480 
AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1365 
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser 
485 490 495 

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1410 
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser 
500 505 510 

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1455 
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val 
515 520 525 

GAC AGC 1461 
Asp Ser 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 1284 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA to mRNA 
(iii) HYPOTHETICAL: no 
(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Human 
(H) CELL LINE: HeLa 
(vii) IMMEDIATE SOURCE: 
(A) LIBRARY: cDNA 
(x) PUBLICATION INFORMATION: 
(A) AUTHORS: Chong, Jayhong A., Tapia-Ramírez, José, Toledo-Aral, Juan, Zheng, Yingcong, Boutros, Michael C., Altschuler, Yelena M., Frohman, Michael A., Kraner, Susan D., Mandel, Gail 
(B) TITLE: REST: A Mammalian Silencer Protein that Restricts Sodium Channel Gene Expression to Neurons 
(C) JOURNAL: Cell 
(D) VOLUME: 80 
(E) ISSUE: 
(F) PAGES: 
(G) DATE: March 24, 1995 
(K) RELEVANT RESIDUES IN SEQ ID NO:28: FROM 1 TO 1284 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

TCT TCT GGA GGA GGA GGG CTG 21 
Ser Ser Gly Gly Gly Gly Leu 
10 15 

TTT ACC AGC AGT GGC AAC ATT GGA ATG GCC CTG CCT AAC GAC ATG 66 
Phe Thr Ser Ser Gly Asn Ile Gly Met Ala Leu Pro Asn Asp Met 
20 25 30 

TAT GAC TTG CAT GAC CTT TCC AAA GCT GAA CTG GCC GCA CCT CAG 111 
Tyr Asp Leu His Asp Leu Ser Lys Ala Glu Leu Ala Ala Pro Gln 
35 40 45 

CTT ATT ATG CTG GCA AAT GTG GCC TTA ACT GGG GAA GTA AAT GGC 156 
Leu Ile Met Leu Ala Asn Val Ala Leu Thr Gly Glu Val Asn Gly 
50 55 60 

AGC TGC TGT GAT TAC CTG GTC GGT GAA GAA AGA CAG ATG GCA GAA 201 
Ser Cys Cys Asp Tyr Leu Val Gly Glu Glu Arg Gln Met Ala Glu 
65 70 75 

CTG ATG CCG GTT GGG GAT AAC AAC TTT TCA GAT AGT GAA GAA GGA 246 
Leu Met Pro Val Gly Asp Asn Asn Phe Ser Asp Ser Glu Glu Gly 
80 85 90 

GAA GGA CTT GAA GAG TCT GCT GAT ATA AAA GGT GAA CCT CAT GGA 291 
Glu Gly Leu Glu Glu Ser Ala Asp Ile Lys Gly Glu Pro His Gly 
95 100 105 

CTG GAA AAC ATG GAA CTG AGA AGT TTG GAA CTC AGC GTC GTA GAA 336 
Leu Glu Asn Met Glu Leu Arg Ser Leu Glu Leu Ser Val Val Glu 
110 115 120 

CCT CAG CCT GTA TTT GAG GCA TCA GGT GCT CCA GAT ATT TAC AGT 381 
Pro Gln Pro Val Phe Glu Ala Ser Gly Ala Pro Asp Ile Tyr Ser 
125 130 135 

TCA AAT AAA GCT CTT GCC CCT GAA ACA CCT GGA GCG GAG GAC AAA 426 
Ser Asn Lys Ala Leu Ala Pro Glu Thr Pro Gly Ala Glu Asp Lys 
140 145 150 

GGC AAG AGC TCG AAG ACC AAA CCC TTT CGC TGT AAG CCA TGC CAA 471 
Gly Lys Ser Ser Lys Thr Lys Pro Phe Arg Cys Lys Pro Cys Gln 
155 160 165 

TAT GAA GCA GAA TCT GAA GAA CAG TTT GTG CAT CAC ATC AGA GTT 516 
Tyr Glu Ala Glu Ser Glu Glu Gln Phe Val His His Ile Arg Val 
170 175 180 

CAC AGT GCT AAG AAA TTT TTT GTG GAA GAG AGT GCA GAG AAG CAG 561 
His Ser Ala Lys Lys Phe Phe Val Glu Glu Ser Ala Glu Lys Gln 
185 190 195 

GCA AAA GCC AGG GAA TCT GGC TCT TCC ACT GCA GAA GAG GGA GAT 606 
Ala Lys Ala Arg Glu Ser Gly Ser Ser Thr Ala Glu Glu Gly Asp 
200 205 210 

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 651 
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr 
215 220 225 

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 696 
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg 
230 235 240 

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 741 
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr 
245 250 255 

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 786 
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His 
260 265 270 

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 831 
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser 
275 280 285 

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 876 
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly 
290 295 300 

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 921 
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln 
305 310 315 

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 966 
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys 
320 325 330 

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 1011 
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His 
335 340 345 

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 1056 
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro 
350 355 360 

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1101 
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn 
365 370 375 

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1146 
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn 
380 385 390 

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1191 
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln 
395 400 405 

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1236 
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met 
410 415 420 

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1281 
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala 
425 430 435 

GAC 1284 
Asp 
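
The paired codon and residue lines of this listing can be cross-checked by translating each codon with the standard genetic code. The following is a minimal Python sketch offered for illustration only and is not part of the original disclosure; the function name `translate` and the way the codon table is built are assumptions made for the example, and the sample input is the first row of SEQ ID NO:28 as printed in this listing.

```python
# Illustrative sketch (not from the original disclosure): translate a DNA
# coding sequence into three-letter amino acid codes with the standard
# genetic code, to cross-check the codon and residue lines of a listing.
from itertools import product

BASES = "TCAG"
# One-letter amino acids for all 64 codons in TCAG order ("*" marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Return the three-letter residues encoded by `dna`, read in frame 1."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, len(seq), 3)]


if __name__ == "__main__":
    # First row of SEQ ID NO:28 as printed in the listing.
    print(" ".join(translate("TCT TCT GGA GGA GGA GGG CTG")))
```

A row whose translated residues differ from the printed residue line points to a transcription error in either the codon line or the residue line; the sketch cannot tell which of the two is wrong.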






(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: Genomic DNA 
(iii) HYPOTHETICAL: no 
(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Rat 
(vii) IMMEDIATE SOURCE: 
(A) LIBRARY: Genomic 
(x) PUBLICATION INFORMATION: 
(A) AUTHORS: Maue, R.A., Kraner, Goodman, R.H., Mandel, Gail 
(B) TITLE: REST: Neuron-Specific Expression of the Rat Brain Type II Sodium Channel Gene Is Directed by Upstream Regulatory Elements 
(C) JOURNAL: Neuron 
(D) VOLUME: 4 
(F) PAGES: 223-231 
(G) DATE: February, 1990 
(K) RELEVANT RESIDUES IN SEQ ID NO: 29: FROM 1 TO 28 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

ATTGGGTTTC AGAACCACGG ACAGCACC 28 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS 
(A) LENGTH: 28 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: Genomic DNA 
(iii) HYPOTHETICAL: no 
(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Rat 
(vii) IMMEDIATE SOURCE: 
(A) LIBRARY: Genomic 
(x) PUBLICATION INFORMATION: 
(A) AUTHORS: Maue, R.A., Kraner, Goodman, R.H., Mandel, Gail 
(B) TITLE: REST: Neuron-Specific Expression of the Rat Brain Type II Sodium Channel Gene Is Directed by Upstream Regulatory Elements 
(C) JOURNAL: Neuron 
(D) VOLUME: 4 
(F) PAGES: 223-231 
(G) DATE: February, 1990 
(K) RELEVANT RESIDUES IN SEQ ID NO: 30: FROM 2353 TO 2400 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

ATTGGGGGGA CGAACCACGG ACAGCACC 28 
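
Claim 15 refers to promoters having "at least about 90% homology to nucleotides 6-28 of SEQ ID NO:29". As a rough illustration only, the Python sketch below treats homology as ungapped percent identity over that window. This is an assumption, since this part of the text does not define how homology is to be computed. The helper name `percent_identity` is hypothetical, and the two 28-mers are copied from SEQ ID NO:29 and SEQ ID NO:30 as printed in this listing.

```python
# Illustrative sketch: ungapped percent identity over a 1-based, inclusive
# window. Equating "homology" with percent identity is an assumption.

SEQ_ID_29 = "ATTGGGTTTCAGAACCACGGACAGCACC"  # as printed in the listing
SEQ_ID_30 = "ATTGGGGGGACGAACCACGGACAGCACC"  # as printed in the listing


def percent_identity(a, b, start, end):
    """Percent of matching positions between a and b over [start, end]."""
    if not (1 <= start <= end <= min(len(a), len(b))):
        raise ValueError("window falls outside one of the sequences")
    window_a = a[start - 1:end].upper()
    window_b = b[start - 1:end].upper()
    matches = sum(x == y for x, y in zip(window_a, window_b))
    return 100.0 * matches / len(window_a)


if __name__ == "__main__":
    identity = percent_identity(SEQ_ID_29, SEQ_ID_30, 6, 28)
    print(f"identity over nucleotides 6-28: {identity:.1f}%")
    print("meets a 90% threshold:", identity >= 90.0)
```

An alignment-based measure that allows gaps could give a different figure; the position-by-position comparison is used here only because the two sequences have the same length.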
What is claimed is: 

1. A substantially pure nucleic acid comprising a nucleic acid encoding a protein having at least about 85% homology to at least the DNA binding domain or the suppressor domain of an animal REST protein. 

2. The substantially pure nucleic acid of claim 1, comprising a nucleic acid encoding at least the DNA binding domain or the suppressor domain of an animal REST protein. 

3. The substantially pure nucleic acid of claim 2, wherein the REST protein is a mammalian REST protein. 

4. The substantially pure nucleic acid of claim 3, wherein the REST protein is a human REST protein. 

5. The substantially pure nucleic acid of claim 4, wherein the nucleic acid comprises SEQ ID NO:2. 

6. The substantially pure nucleic acid of claim 5, wherein the nucleic acid comprises SEQ ID NO:10. 

7. The substantially pure nucleic acid of claim 1, comprising a nucleic acid encoding both the DNA binding domain and the suppressor domain of an animal REST protein. 

8. The substantially pure nucleic acid of claim 7, wherein the REST protein is a mammalian REST protein. 

9. The substantially pure nucleic acid of claim 8, wherein the REST protein is a human REST protein. 

10. The substantially pure nucleic acid of claim 9, wherein the nucleic acid comprises SEQ ID NO:2. 

11. The substantially pure nucleic acid of claim 10, wherein the nucleic acid comprises SEQ ID NO:10. 

12. The substantially pure nucleic acid of claim 1, comprising a nucleic acid encoding a protein differing from an animal REST protein by no more than about 20 point mutations. 

13. A substantially pure nucleic acid that hybridizes with an animal REST nucleic acid under stringent conditions. 

14. The substantially pure nucleic acid of claim 13, comprising the nucleic acid of SEQ ID NO:1. 

15. A substantially pure nucleic acid comprising a nucleic acid encoding a protein that binds to a promoter having at least about 90% homology to nucleotides 6-28 of SEQ ID NO:29 and acting to suppress the activity of a promoter having said promoter. 

16. A substantially pure protein having at least about 85% homology with at least the DNA binding domain or the suppressor domain of an animal REST protein. 

17. The substantially pure protein of claim 16, comprising at least the DNA binding domain or the suppressor domain of an animal REST protein. 

18. The substantially pure protein of claim 17, comprising the protein of SEQ ID NO:2. 

19. The substantially pure protein of claim 18, comprising both the DNA binding domain and the suppressor domain of an animal REST protein. 

20. The substantially pure protein of claim 19, comprising the protein of SEQ ID NO:10. 

21. A transformed eukaryotic or prokaryotic cell comprising a nucleic acid encoding a protein having at least about 85% homology to at least one of the DNA binding domain or the suppressor domain of an animal REST protein. 

22. The transformed cell of claim 21 comprising a nucleic acid encoding at least the DNA binding domain or the suppressor domain of an animal REST protein. 

23. The transformed cell of claim 22, wherein the REST protein is a mammalian REST protein. 

24. The transformed cell of claim 23, wherein the REST protein is a human REST protein. 

25. The transformed cell of claim 24, wherein the nucleic acid comprises SEQ ID NO:2. 

26. A vector capable of reproducing in a eukaryotic or prokaryotic cell comprising a nucleic acid encoding a protein having at least about 85% homology to at least the DNA binding domain or the suppressor domain of an animal REST protein. 

27. The vector capable of reproducing in a eukaryotic or prokaryotic cell of claim 26, comprising a nucleic acid encoding at least the DNA binding domain or the suppressor domain of an animal REST protein. 

28. The vector capable of reproducing in a eukaryotic or prokaryotic cell of claim 27, wherein the REST protein is a mammalian REST protein. 

29. The vector capable of reproducing in a eukaryotic or prokaryotic cell of claim 28, wherein the REST protein is a human REST protein. 

30. The vector capable of reproducing in a eukaryotic or prokaryotic cell of claim 29, wherein the nucleic acid comprises SEQ ID NO:2. 

31. A method of preparing a protein having REST activity, wherein the protein has at least about 85% homology with at least the DNA binding domain or the suppressor domain of an animal REST protein, the method comprising: 
(a) transforming an appropriate eukaryotic or prokaryotic cell with an expression vector for expressing intracellularly or extracellularly a nucleic acid encoding the protein; 
(b) growing the transformed cell in culture; and 
(c) isolating the protein from the transformed cell or the culture medium. 

32. A pharmaceutical composition for treating an animal having de-differentiated neural cells or neural cells exhibiting diminished activity comprising an effective amount of a REST-interfering nucleic acid, wherein the REST-interfering nucleic acid comprises an antisense molecule directed against REST expression or an expression vector for expressing REST DNA binding activity but not REST silencer activity, and a pharmaceutically acceptable carrier. 

33. The pharmaceutical composition of claim 32, wherein the animal has brain cancer. 

34. The pharmaceutical composition of claim 32, wherein said animal has a demyelinating myasthenia gravis, muscular dystrophy, botulism, peripheral neuropathies, traumatic nerve injury, post stroke degeneration, post-traumatic spinal and neural degeneration, poliomyelitis or rabies. 

35. A pharmaceutical composition for an animal having neural cells exhibiting excessive neural activity comprising an effective amount of an expression vector comprising a nucleic acid encoding a protein that inhibits the expression of neural proteins in non-neural tissues, and a pharmaceutically acceptable carrier. 

36. The pharmaceutical composition of claim 35, wherein the animal has epilepsy, Lennox-Gastaut syndrome, spasticity, trauma-induced pain, schizophrenia, stroke or a neurodegenerative disease. 

37. The pharmaceutical composition of claim 36, wherein the animal has Alzheimer's, Parkinson's or Huntington's disease. 

38. The pharmaceutical composition of claim 36, wherein the animal has epilepsy. 

39. The pharmaceutical composition of claim 36, wherein the animal has a neurodegenerative disease. 

40. A method of determining the level of REST expression in a tissue sample comprising: 
(a) contacting the tissue sample with (i) a nucleic acid that binds to REST mRNA under stringent conditions or (ii) an antibody specific for REST; 
(b) washing the tissue sample to remove non-specific hybridizations of the nucleic acid or non-specific antibody binding; and 
(c) determining the level of hybridized nucleic acid or bound antibody. 

41. An antibody that reacts specifically with the substantially pure protein of claim 16. 

42. A pair of PCR primers capable of directing the amplification of the substantially pure nucleic acid of claim 1. 
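
Claim 42 recites a pair of PCR primers for amplifying the claimed nucleic acid. As a simple illustration, and not the applicants' procedure, a primer pair flanking a template can be sketched by taking the first bases of the sense strand as the forward primer and the reverse complement of the last bases as the reverse primer. The function names and the 18-nucleotide primer length are assumptions chosen for this example; practical primer design would also weigh melting temperature, GC content and self-complementarity. The template used is one row of SEQ ID NO:28 from the sequence listing.

```python
# Illustrative sketch: derive a naive primer pair from the two ends of a
# template. Primer length and function names are assumptions for this example.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def naive_primer_pair(template, length=18):
    """Return (forward, reverse) primers, both written 5' to 3'."""
    template = "".join(template.split()).upper()
    if len(template) < 2 * length:
        raise ValueError("template is too short for non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Nucleotides 22-66 of SEQ ID NO:28 as printed in the sequence listing.
    template = "TTT ACC AGC AGT GGC AAC ATT GGA ATG GCC CTG CCT AAC GAC ATG"
    fwd, rev = naive_primer_pair(template)
    print("forward:", fwd)
    print("reverse:", rev)
```

The same `reverse_complement` helper gives the antisense strand of any sense sequence, which is the relationship an antisense molecule of the kind mentioned in claim 32 relies on.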
Fig. 1 

(Part 1 of 6) 

ATCTGGCGCG GCGTAGCCCT GTGTTGGAAT GTGCGGCTGC CGCGAGCTCG -275 

CGGCGCAGCA GCGGAGCGAG CGCCGCCGAG GCCCGGGGCC CCAGACCCTG -225 

GCGGCGGCTG CGGCAGCCGA GACGGCAGGG CGAGGCCCGG AGGCCTGAGC -175 

ACCCTCTGCA GCCCCACTCC TGGGCCTTCT TGGTCCACGA CGGCCCCAGC -125 

ACCCAACTTT ACCACCCTCC CCCACCTCTC CCCCGAAACT CCAGCAACAA -75 

AGAAAAGTAG TCGGAGAAGG AGCGGCGACT CAGGGTCGCC CGCCCCTCCT -25 

CACCGAGGAA GGCCGAATAC AGTT -1 
[Remainder of Fig. 1, Part 1 of 6: coding sequence from nucleotide 1, beginning Met Ala Thr Gln Val Met Gly Gln Ser; the rest of this part is not legible in the extracted text.] 
Fig. 1 
Part 2 of 6 

TTC TCC AAG GGC CCC ATT CGC TGT GAC CGC TGC GGC TAC AAT ACT 675 
Phe Ser Lys Gly Pro Ile Arg Cys Asp Arg Cys Gly Tyr Asn Thr 
215 220 225 

AAT CGA TAT GAT CAC TAT ACA GCA CAC CTG AAA CAC CAC ACC AGA 720 
Asn Arg Tyr Asp His Tyr Thr Ala His Leu Lys His His Thr Arg 
230 235 240 

GCT GGG GAT AAT GAG CGA GTC TAC AAG TGT ATC ATT TGC ACA TAC 765 
Ala Gly Asp Asn Glu Arg Val Tyr Lys Cys Ile Ile Cys Thr Tyr 
245 250 255 

ACA ACA GTG AGC GAG TAT CAC TGG AGG AAA CAT TTA AGA AAC CAT 810 
Thr Thr Val Ser Glu Tyr His Trp Arg Lys His Leu Arg Asn His 
260 265 270 

TTT CCA AGG AAA GTA TAC ACA TGT GGA AAA TGC AAC TAT TTT TCA 855 
Phe Pro Arg Lys Val Tyr Thr Cys Gly Lys Cys Asn Tyr Phe Ser 
275 280 285 

GAC AGA AAA AAC AAT TAT GTT CAG CAT GTT AGA ACT CAT ACA GGA 900 
Asp Arg Lys Asn Asn Tyr Val Gln His Val Arg Thr His Thr Gly 
290 295 300 

GAA CGC CCA TAT AAA TGT GAA CTT TGT CCT TAC TCA AGT TCT CAG 945 
Glu Arg Pro Tyr Lys Cys Glu Leu Cys Pro Tyr Ser Ser Ser Gln 
305 310 315 

AAG ACT CAT CTA ACT AGA CAT ATG CGT ACT CAT TCA GGT GAG AAG 990 
Lys Thr His Leu Thr Arg His Met Arg Thr His Ser Gly Glu Lys 
320 325 330 

CCA TTT AAA TGT GAT CAG TGC AGT TAT GTG GCC TCT AAT CAA CAT 1035 
Pro Phe Lys Cys Asp Gln Cys Ser Tyr Val Ala Ser Asn Gln His 
335 340 345 

GAA GTA ACC CGC CAT GCA AGA CAG GTT CAC AAT GGG CCT AAA CCT 1080 
Glu Val Thr Arg His Ala Arg Gln Val His Asn Gly Pro Lys Pro 
350 355 360 

CTT AAT TGC CCA CAC TGT GAT TAC AAA ACA GCA GAT AGA AGC AAC 1125 
Leu Asn Cys Pro His Cys Asp Tyr Lys Thr Ala Asp Arg Ser Asn 
365 370 375 

TTC AAA AAA CAT GTA GAG CTA CAT GTG AAC CCA CGG CAG TTC AAT 1170 
Phe Lys Lys His Val Glu Leu His Val Asn Pro Arg Gln Phe Asn 
380 385 390 

TGC CCT GTA TGT GAC TAT GCA GCT TCC AAG AAG TGT AAT CTA CAG 1215 
Cys Pro Val Cys Asp Tyr Ala Ala Ser Lys Lys Cys Asn Leu Gln 
395 400 405 

TAT CAC TTC AAA TCT AAG CAT CCT ACT TGT CCT AAT AAA ACA ATG 1260 
Tyr His Phe Lys Ser Lys His Pro Thr Cys Pro Asn Lys Thr Met 
410 415 420 

GAT GTC TCA AAA GTG AAA CTA AAG AAA ACC AAA AAA CGA GAG GCT 1305 
Asp Val Ser Lys Val Lys Leu Lys Lys Thr Lys Lys Arg Glu Ala 
425 430 435 

GAC TTG CCT GAT AAT ATT ACC AAT GAA AAA ACA GAA ATA GAA CAA 1350 
Asp Leu Pro Asp Asn Ile Thr Asn Glu Lys Thr Glu Ile Glu Gln 
440 445 450 

Fig. 1 
Part 3 of 6 

ACA AAA ATA AAA GGG GAT GTG GCT GGA AAG AAA AAT GAA AAG TCC 1395 
Thr Lys Ile Lys Gly Asp Val Ala Gly Lys Lys Asn Glu Lys Ser 
455 460 465 

GTC AAA GCA GAG AAA AGA GAT GTC TCA AAA GAG AAA AAG CCT TCT 1440 
Val Lys Ala Glu Lys Arg Asp Val Ser Lys Glu Lys Lys Pro Ser 
470 475 480 

AAT AAT GTG TCA GTG ATC CAG GTG ACT ACC AGA ACT CGA AAA TCA 1485 
Asn Asn Val Ser Val Ile Gln Val Thr Thr Arg Thr Arg Lys Ser 
485 490 495 

GTA ACA GAG GTG AAA GAG ATG GAT GTG CAT ACA GGA AGC AAT TCA 1530 
Val Thr Glu Val Lys Glu Met Asp Val His Thr Gly Ser Asn Ser 
500 505 510 

GAA AAA TTC AGT AAA ACT AAG AAA AGC AAA AGG AAG CTG GAA GTT 1575 
Glu Lys Phe Ser Lys Thr Lys Lys Ser Lys Arg Lys Leu Glu Val 
515 520 525 

[...] GAG GAA TCT [...] 1620 
Asp Ser His Ser Leu His Gly Pro Val Asn Asp Glu Glu Ser Ser 
530 535 540 

ACA AAA AAG AAA AAG AAG GTA GAA AGC AAA TCC AAA AAT AAT ACT 1665 
Thr Lys Lys Lys Lys Lys Val Glu Ser Lys Ser Lys Asn Asn Thr 
545 550 555 

CAG GAA GTG CCA AAG GGT GAC AGC AAA GTG GAG GAG AAT AAA AAG 1710 
Gln Glu Val Pro Lys Gly Asp Ser Lys Val Glu Glu Asn Lys Lys 
560 565 570 

CAA AAT ACT TGC ATG AAA AAA AGT ACA AAG AAG AAA ACT CTG AAA 1755 
Gln Asn Thr Cys Met Lys Lys Ser Thr Lys Lys Lys Thr Leu Lys 
575 580 585 

[...] CAG AAG GAA CCT 1800 
Asn Lys Ser Ser Lys Lys Ser Ser Lys Pro Pro Gln Lys Glu Pro 
590 595 600 

GTT GAG AAG GGA TCT GCT CAG ATG GAC CCT CCT CAG ATG GGG CCT 1845 
Val Glu Lys Gly Ser Ala Gln Met Asp Pro Pro Gln Met Gly Pro 
605 610 615 

[...] GTG GAG CTG 1890 
Ala Pro Thr Glu Ala Val Gln Lys Gly Pro Val Gln Val Glu Leu 
620 625 630 

CCA CCT CCC ATG GAG CAT GCT CAG ATG GAG GGT GCC CAG ATA CGG 1935 
Pro Pro Pro Met Glu His Ala Gln Met Glu Gly Ala Gln Ile Arg 
635 640 645 

[...] ATG GAG GTG GTT CAG GAG GGG 1980 
Pro Ala Pro Asp Glu Pro Val Gln Met Glu Val Val Gln Glu Gly 
650 655 660 

[...] GAG CCT GCT CAG [...] 2025 
Pro Ala Gln Lys Glu Leu Leu Pro Pro Val Glu Pro Ala Gln Met 
665 670 675 

GTG GGT GCC CAA ATT GTA CTT GCT CAC ATG GAG CTG CCT CCT CCC 2070 
Val Gly Ala Gln Ile Val Leu Ala His Met Glu Leu Pro Pro Pro 
680 685 690 

Fig. 1 
Part 4 of 6 

[Nucleotides 2071 to 2790, encoding amino acid residues 691 to 930. The codon and residue columns of this part were scrambled in extraction and cannot be reassembled reliably from the extracted text.] 

Fig. 1 
Part 5 of 6 

[...] ACT GAC AGT ATA GTT TGT 2835 
Gly Glu Thr Leu Asn Gly Lys His Gln Thr Asp Ser Ile Val Cys 
935 940 945 

GAA ATG AAA ATG GAC ACT GAT CAG AAC ACA AGA GAG AAT CTC ACT 2880 
Glu Met Lys Met Asp Thr Asp Gln Asn Thr Arg Glu Asn Leu Thr 
950 955 960 

GGT ATA AAT TCA ACA GTT GAA GAA CCA GTT TCA CCA ATG CTT CCC 2925 
Gly Ile Asn Ser Thr Val Glu Glu Pro Val Ser Pro Met Leu Pro 
965 970 975 

[...] TCC AAA ACT GCA CTG 2970 
Pro Ser Ala Val Glu Glu Arg Glu Ala Val Ser Lys Thr Ala Leu 
980 985 990 

GCA TCA CCT CCT GCT ACA ATG GCA GCA AAT GAG TCT CAG GAA ATT 3015 
Ala Ser Pro Pro Ala Thr Met Ala Ala Asn Glu Ser Gln Glu Ile 
995 1000 1005 

[...] GGA AGT GAC CTA AGT 3060 
Asp Glu Asp Glu Gly Ile His Ser His Glu Gly Ser Asp Leu Ser 
1010 1015 1020 

GAC AAC ATG TCA GAG GGT AGT GAT GAT TCT GGA TTG CAT GGG GCT 3105 
Asp Asn Met Ser Glu Gly Ser Asp Asp Ser Gly Leu His Gly Ala 
1025 1030 1035 

CGG CCA GTT CCA CAA GAA TCT AGC AGA AAA AAT GCA AAG GAA GCC 3150 
Arg Pro Val Pro Gln Glu Ser Ser Arg Lys Asn Ala Lys Glu Ala 
1040 1045 1050 

TTG GCA GTC AAA GCG GCT AAG GGA GAT TTT GTT TGT ATC TTC TGT 3195 
Leu Ala Val Lys Ala Ala Lys Gly Asp Phe Val Cys Ile Phe Cys 
1055 1060 1065 

GAT CGT TCT TTC AGA AAG GGA AAA GAT TAC AGC AAA CAC CTC AAT 3240 
Asp Arg Ser Phe Arg Lys Gly Lys Asp Tyr Ser Lys His Leu Asn 
1070 1075 1080 

CGC CAT TTG GTT AAT GTG TAC TAT CTT GAA GAA GCA GCT CAA GGG 3285 
Arg His Leu Val Asn Val Tyr Tyr Leu Glu Glu Ala Ala Gln Gly 
1085 1090 1095 

CAG GAG TAATG AAACTTTGAA CAAGGTTTCA GTTCTTAGTT 3326 
Gln Glu 
1097 

TGTAAGGTAT ATTACATTTT ATATTCATTT ATGATAGCAG ACAACCTTTT 3376 
AAGATTGCTT TAATTAGTAT CTGATGTTGA TTTTTAAGTG GCATTCTTTT 3426 
CCTTAGGACT TTTTATGTAT ACCTGTTGAT TGTTGTGTAA ATTTTAGTAA 3476 

ATCTAAGAGA GTGTACTAAA CCAGCAGGTA TCTGTTAGCT TATGTGTTTA 3526 
ATTGAAATTA GAAGGCTAAG ATGGTATAAC AGCATTTTAT TGCTTTGTCC 3576 
AGCTACAACA TGTCATTTTT TTCTCCATGT CTTATCTTCC TGTTTCACTT 3626 
TAGTTTATTC TTCGTTTTTT ATTGAGATCT ATAAAAAATT GGCTTACTTA 3676 
ATAGCAAATT ACTTGAAGAA TTTGCCTGCT TTATATAAAG TTAGCACTTT 3726 
AAGATTTTTT TTTTAGAGAT GAGAAGACAT TTAAATTGAA GAAAAATTCC 3776 
CCCAGCAATA GACAGTCTAT CAGTCCAAGT ATTTACTTCC TGAGTTTTGA 3826 
TCAATATTTT TTATTTGTGT ATGTTAATCG TCATAAAAAC AGTGATTTTG 3876 
GTGTGTTTTT TATTTTGGTG CTTTAATGGC TTAAGATGTT GCACATTTTT 3926 
TTTTTCTTTT GGTTTCTGTT TATGTTTTTT TGCCTATGCA GTTAAATTTT 3976 
TCCTAGAAAT AGCATTTGTG TTGAACAGTA ACACTTTATA CATATATATA 4026 



WO 96/29433 
PCT/US96/03940 

Fig. 1 
Part 6 of 6 

TGCATGTTTA TTTTGTTTGG CGTCTTTGGA GGGATGCTTT TAGACTTGTT 4076 
TGCAAAAGGG CAGTTTTCTT TTTCTTTGCT GCAGTTGTCT ATTTTGCAGA 4126 
ATAATAGTGT GTGCAAGTTT GTGAGCAAAT GAAATATGCA GGTTCAATCT 4176 
ATTGATTTTG ATTTTTACAT CTTATATCTA TGCCAGAATC TGTATTTCAT 4226 
ATAACTTATT TATTTCGAAT GGATGTAGTA AATTCACAGC TATCAGTTTT 4276 
GATTTTGCAA TAAATAAACC ACTAGGTTGC ATGTCGAACA AATTTTTATC 4326 
TCAAATACCA ACCATCAGTT TTTTTTTTCA TGTGTTTTGG TACAGCTAAT 4376 
TCCTAATTGT AGAGTGTTAA ATGTTTGAGG AGAACCTTTT CTCATAGATG 4426 
GTTGGTGTTC ATATGGCNAC TTTACAATAA AGAGAACTGT AAGTGATATT 4476 
TGGAAACTAC AAACCTGGAA TTAGGAGATA TAATTATTCC TTCAAGTTTT 4526 
ATAGATATCA CTTGGGAGAT TCCAAAGCCA TAGCTATTAC GCNGCAAACC 4576 
TAGGATAAGA AAGGTAGTAT GAGTGCTGGT AGACCAGCTG CAACATTTCC 4626 
TATATCAGAT GAAAAAGGCT GGTGAAACAA GTACAGTCCA GATT[...]TAA 4676 
[illegible] [illegible] CTGGAGAAAG [illegible] [illegible] 4726 
TGTGTGATAG CCTCTTTCTA TAGGTGAGGC CTCAAATGAA TTGCAGCTAT 4776 
CCTGGTGTTC CTATGAGGGC ACTTGTATGA AAAAGGCAGT ACTCCAAAAC 4826 
ATTTTTGATG GTTCTTTGGC CAGTTGCCAA AGAGTGTGAA AGAATCCAAT 4876 
AGAGGATTTT TCTTACTGAT AGCAGTCATT CATTGCAGTA AAATAAAATA 4926 
TGAATTCCCA TTAGGGAATC TTGAATTCTG ACCTCCCATA CTCCGTTTTG 4976 
AAATAACCAC TTATATTTCA TTTTTTAAAA ATCTGATGAT CTCTTTGAGG 5026 
CAGGTTTCAG ATTTGGCAGT ACAACATGAA AGATTAGGAA AAGCATTAAT 5076 
AACGTGTGGG TGGAAAGCTT GTTAAAAATC TGAGAGTGAA GTTTGAGTTA 5126 
AAAGTTGTTT GACATGGCAT TGACTGGGAG GCCAAAGATT TAAAGAAGCG 5176 
GAAGATTCTT CTCTTAAGAC ATGAGGAGTA AGTTGTGTGA TAATGGTATG 5226 
TGTTTTGTGT GCATGAATGG ACATTGTAAA TGTTGAATTC TAGGCTCCGA 5276 
CAATCATTGT CAACAGAAGA TAAAGCTGCA AATATTTATG TTTTAAAA 5324 
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